[commit: ghc] master: Make tagForCon non-linear (faf60e8)

Bartosz Nitka niteria at gmail.com
Fri Oct 27 22:40:56 UTC 2017


Hi Simon,

I considered that, but I'm shy at changing data types. Especially for
caching, because sometimes keeping the data in sync can be difficult.
This was a safer way to approach the problem, and I believe this part
of the code is no longer the problem.
I can try the caching approach, but I have limited time and if I run
into problems (I'm thinking performance, technically caching should
increase allocations) I will have to abort.

Let me piggy-back a question for a follow-up diff on this.

Another bottleneck is in mkDataCon. For each data constructor we do [1]:

  tag = assoc "mkDataCon" (tyConDataCons rep_tycon `zip` [fIRST_TAG..]) con

The easiest way to fix this was to cache the tag allocation (using
UniqFM) in DataTyCon and SumTyCon.
This is all in a knot, but it works out in this case.

Does that sound like a sound approach?
An alternative would be to refactor the code to pass the tags down,
but that looked harder.

Unfortunately going this route doesn't help with determining the size,
because checking the size of UniqFM is linear.


[1] https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/basicTypes/DataCon.hs;faf60e858a293affca463043c830e1edb5685003$921

Thanks,
Bartosz

2017-10-27 15:06 GMT-07:00 Simon Peyton Jones <simonpj at microsoft.com>:
> Bartosz
>
> Why not just cache the #datacons in the TyCon (in the DataTyCon constructor).  That seems simple, direct, and fast.
>
> And simpler than the stuff you are forced into here.
>
> Simon
>
> | -----Original Message-----
> | From: ghc-commits [mailto:ghc-commits-bounces at haskell.org] On Behalf Of
> | git at git.haskell.org
> | Sent: 27 October 2017 23:03
> | To: ghc-commits at haskell.org
> | Subject: [commit: ghc] master: Make tagForCon non-linear (faf60e8)
> |
> | Repository : ssh://git@git.haskell.org/ghc
> |
> | On branch  : master
> | Link       :
> | https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fghc.haske
> | ll.org%2Ftrac%2Fghc%2Fchangeset%2Ffaf60e858a293affca463043c830e1edb568500
> | 3%2Fghc&data=02%7C01%7Csimonpj%40microsoft.com%7C677da6a867674476306508d5
> | 1d869247%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636447386176303711&
> | sdata=tp4PTSkN5iBwKkWgZdT%2BcHKK8EKz8NQZeE%2FegVdx%2FIM%3D&reserved=0
> |
> | >---------------------------------------------------------------
> |
> | commit faf60e858a293affca463043c830e1edb5685003
> | Author: Bartosz Nitka <bnitka at fb.com>
> | Date:   Fri Oct 20 20:30:52 2017 +0100
> |
> |     Make tagForCon non-linear
> |
> |     Computing the number of constructors for TyCon is linear
> |     in the number of constructors.
> |     That's wasteful if all you want to check is if that
> |     number is smaller than what fits in tag bits
> |     (usually 8 things).
> |
> |     What this change does is to use a function that can
> |     determine the ineqaulity without computing the size.
> |
> |     This improves compile time on a module with a
> |     data type that has 10k constructors.
> |     The variance in total time is (suspiciously) high,
> |     but going by the best of 3 the numbers are 8.186s vs 7.511s.
> |     For 1000 constructors the difference isn't noticeable:
> |     0.646s vs 0.624s.
> |     The hot spots were cgDataCon and cgEnumerationTyCon
> |     where tagForCon is called in a loop.
> |
> |     One alternative would be to pass down the size.
> |
> |     Test Plan: harbormaster
> |
> |     Reviewers: bgamari, simonmar, austin
> |
> |     Reviewed By: simonmar
> |
> |     Subscribers: rwbarton, thomie
> |
> |     Differential Revision: https://phabricator.haskell.org/D4116
> |
> |
> | >---------------------------------------------------------------
> |
> | faf60e858a293affca463043c830e1edb5685003
> |  compiler/codeGen/StgCmmClosure.hs | 11 ++++++++---
> |  compiler/types/TyCon.hs           | 16 +++++++++++++++-
> |  2 files changed, 23 insertions(+), 4 deletions(-)
> |
> | diff --git a/compiler/codeGen/StgCmmClosure.hs
> | b/compiler/codeGen/StgCmmClosure.hs
> | index 1da1f70..2501ec9 100644
> | --- a/compiler/codeGen/StgCmmClosure.hs
> | +++ b/compiler/codeGen/StgCmmClosure.hs
> | @@ -361,13 +361,18 @@ type DynTag = Int       -- The tag on a *pointer*
> |  isSmallFamily :: DynFlags -> Int -> Bool  isSmallFamily dflags fam_size
> | = fam_size <= mAX_PTR_TAG dflags
> |
> | +-- | Faster version of isSmallFamily if you haven't computed the size
> | yet.
> | +isSmallFamilyTyCon :: DynFlags -> TyCon -> Bool isSmallFamilyTyCon
> | +dflags tycon =
> | +  tyConFamilySizeAtMost tycon (mAX_PTR_TAG dflags)
> | +
> |  tagForCon :: DynFlags -> DataCon -> DynTag  tagForCon dflags con
> | -  | isSmallFamily dflags fam_size = con_tag
> | -  | otherwise                     = 1
> | +  | isSmallFamilyTyCon dflags tycon = con_tag
> | +  | otherwise                       = 1
> |    where
> |      con_tag  = dataConTag con -- NB: 1-indexed
> | -    fam_size = tyConFamilySize (dataConTyCon con)
> | +    tycon = dataConTyCon con
> |
> |  tagForArity :: DynFlags -> RepArity -> DynTag  tagForArity dflags arity
> | diff --git a/compiler/types/TyCon.hs b/compiler/types/TyCon.hs index
> | 39d2e9b..103c824 100644
> | --- a/compiler/types/TyCon.hs
> | +++ b/compiler/types/TyCon.hs
> | @@ -78,7 +78,7 @@ module TyCon(
> |          tyConDataCons, tyConDataCons_maybe,
> |          tyConSingleDataCon_maybe, tyConSingleDataCon,
> |          tyConSingleAlgDataCon_maybe,
> | -        tyConFamilySize,
> | +        tyConFamilySize, tyConFamilySizeAtMost,
> |          tyConStupidTheta,
> |          tyConArity,
> |          tyConRoles,
> | @@ -2205,6 +2205,20 @@ tyConFamilySize tc@(AlgTyCon { algTcRhs = rhs })
> |        _                              -> pprPanic "tyConFamilySize 1"
> | (ppr tc)
> |  tyConFamilySize tc = pprPanic "tyConFamilySize 2" (ppr tc)
> |
> | +-- | Determine if number of value constructors a 'TyCon' has is smaller
> | +-- than n. Faster than tyConFamilySize tc <= n.
> | +-- Panics if the 'TyCon' is not algebraic or a tuple
> | +tyConFamilySizeAtMost  :: TyCon -> Int -> Bool tyConFamilySizeAtMost
> | +tc@(AlgTyCon { algTcRhs = rhs }) n
> | +  = case rhs of
> | +      DataTyCon { data_cons = cons } -> lengthAtMost cons n
> | +      NewTyCon {}                    -> 1 <= n
> | +      TupleTyCon {}                  -> 1 <= n
> | +      SumTyCon { data_cons = cons }  -> lengthAtMost cons n
> | +      _                              -> pprPanic "tyConFamilySizeAtMost
> | 1"
> | +                                          (ppr tc)
> | +tyConFamilySizeAtMost tc _ = pprPanic "tyConFamilySizeAtMost 2" (ppr
> | +tc)
> | +
> |  -- | Extract an 'AlgTyConRhs' with information about data constructors
> | from an
> |  -- algebraic or tuple 'TyCon'. Panics for any other sort of 'TyCon'
> |  algTyConRhs :: TyCon -> AlgTyConRhs
> |
> | _______________________________________________
> | ghc-commits mailing list
> | ghc-commits at haskell.org
> | https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmail.hask
> | ell.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fghc-
> | commits&data=02%7C01%7Csimonpj%40microsoft.com%7C677da6a867674476306508d5
> | 1d869247%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636447386176313720&
> | sdata=NfTqei3XryWnCiNRgOZDT6XfMRMfAST7EKFBSi%2FCXa8%3D&reserved=0


More information about the ghc-devs mailing list