[commit: ghc] master: Make tagForCon non-linear (faf60e8)

Simon Peyton Jones simonpj at microsoft.com
Fri Oct 27 23:30:42 UTC 2017


| I considered that, but I'm shy at changing data types. Especially for
| caching, because sometimes keeping the data in sync can be difficult.

In this case, not so.  Just make a smart contructor for DataTyCon and use it.  There are very few calls to it anyway!

As to the storage cost this is one word per data type.  There aren't that many data types!  (Compared to how many of everything else there is.)

Performance should be better -- no need to count the list repeatedly.

| Let me piggy-back a question for a follow-up diff on this.
| 
| Another bottleneck is in mkDataCon. For each data constructor we do [1]:
| 
|   tag = assoc "mkDataCon" (tyConDataCons rep_tycon `zip` [fIRST_TAG..])
| con
| 
| The easiest way to fix this was to cache the tag allocation (using
| UniqFM) in DataTyCon and SumTyCon.

I don’t like this.  Unlike the family size, the tag-map would be used in just one place, when building the data constructors.  Yes it'd be an extra argument to pass, but it's a local temporary thing, just during data type construction, whereas adding a field in DataTyCon stays forever.

To pass it around
* TcTyClsDecls.tcConDecls constructs a tag-map (NameEnv Tag) from the
  knot-tied tycoon
* Passes the tag-map to tcConDecl
* Which passes it to buildDataCon
* Which looks up the tag and passes it to mkDataCon

Not hard.

For the family size though, just cache it in DataTyCon.

Simon

| This is all in a knot, but it works out in this case.
| 
| Does that sound like a sound approach?
| An alternative would be to refactor the code to pass the tags down, but
| that looked harder.
| 
| Unfortunately going this route doesn't help with determining the size,
| because checking the size of UniqFM is linear.
| 
| 
| [1]
| https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/basi
| cTypes/DataCon.hs;faf60e858a293affca463043c830e1edb5685003$921
| 
| Thanks,
| Bartosz
| 
| 2017-10-27 15:06 GMT-07:00 Simon Peyton Jones <simonpj at microsoft.com>:
| > Bartosz
| >
| > Why not just cache the #datacons in the TyCon (in the DataTyCon
| constructor).  That seems simple, direct, and fast.
| >
| > And simpler than the stuff you are forced into here.
| >
| > Simon
| >
| > | -----Original Message-----
| > | From: ghc-commits [mailto:ghc-commits-bounces at haskell.org] On Behalf
| > | Of git at git.haskell.org
| > | Sent: 27 October 2017 23:03
| > | To: ghc-commits at haskell.org
| > | Subject: [commit: ghc] master: Make tagForCon non-linear (faf60e8)
| > |
| > | Repository : ssh://git@git.haskell.org/ghc
| > |
| > | On branch  : master
| > | Link       :
| > | https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fghc.
| > | haske
| > | ll.org%2Ftrac%2Fghc%2Fchangeset%2Ffaf60e858a293affca463043c830e1edb5
| > | 68500
| > | 3%2Fghc&data=02%7C01%7Csimonpj%40microsoft.com%7C677da6a867674476306
| > | 508d5
| > | 1d869247%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63644738617630
| > | 3711&
| > | sdata=tp4PTSkN5iBwKkWgZdT%2BcHKK8EKz8NQZeE%2FegVdx%2FIM%3D&reserved=
| > | 0
| > |
| > | >---------------------------------------------------------------
| > |
| > | commit faf60e858a293affca463043c830e1edb5685003
| > | Author: Bartosz Nitka <bnitka at fb.com>
| > | Date:   Fri Oct 20 20:30:52 2017 +0100
| > |
| > |     Make tagForCon non-linear
| > |
| > |     Computing the number of constructors for TyCon is linear
| > |     in the number of constructors.
| > |     That's wasteful if all you want to check is if that
| > |     number is smaller than what fits in tag bits
| > |     (usually 8 things).
| > |
| > |     What this change does is to use a function that can
| > |     determine the ineqaulity without computing the size.
| > |
| > |     This improves compile time on a module with a
| > |     data type that has 10k constructors.
| > |     The variance in total time is (suspiciously) high,
| > |     but going by the best of 3 the numbers are 8.186s vs 7.511s.
| > |     For 1000 constructors the difference isn't noticeable:
| > |     0.646s vs 0.624s.
| > |     The hot spots were cgDataCon and cgEnumerationTyCon
| > |     where tagForCon is called in a loop.
| > |
| > |     One alternative would be to pass down the size.
| > |
| > |     Test Plan: harbormaster
| > |
| > |     Reviewers: bgamari, simonmar, austin
| > |
| > |     Reviewed By: simonmar
| > |
| > |     Subscribers: rwbarton, thomie
| > |
| > |     Differential Revision: https://phabricator.haskell.org/D4116
| > |
| > |
| > | >---------------------------------------------------------------
| > |
| > | faf60e858a293affca463043c830e1edb5685003
| > |  compiler/codeGen/StgCmmClosure.hs | 11 ++++++++---
| > |  compiler/types/TyCon.hs           | 16 +++++++++++++++-
| > |  2 files changed, 23 insertions(+), 4 deletions(-)
| > |
| > | diff --git a/compiler/codeGen/StgCmmClosure.hs
| > | b/compiler/codeGen/StgCmmClosure.hs
| > | index 1da1f70..2501ec9 100644
| > | --- a/compiler/codeGen/StgCmmClosure.hs
| > | +++ b/compiler/codeGen/StgCmmClosure.hs
| > | @@ -361,13 +361,18 @@ type DynTag = Int       -- The tag on a
| *pointer*
| > |  isSmallFamily :: DynFlags -> Int -> Bool  isSmallFamily dflags
| > | fam_size = fam_size <= mAX_PTR_TAG dflags
| > |
| > | +-- | Faster version of isSmallFamily if you haven't computed the
| > | +size
| > | yet.
| > | +isSmallFamilyTyCon :: DynFlags -> TyCon -> Bool isSmallFamilyTyCon
| > | +dflags tycon =
| > | +  tyConFamilySizeAtMost tycon (mAX_PTR_TAG dflags)
| > | +
| > |  tagForCon :: DynFlags -> DataCon -> DynTag  tagForCon dflags con
| > | -  | isSmallFamily dflags fam_size = con_tag
| > | -  | otherwise                     = 1
| > | +  | isSmallFamilyTyCon dflags tycon = con_tag
| > | +  | otherwise                       = 1
| > |    where
| > |      con_tag  = dataConTag con -- NB: 1-indexed
| > | -    fam_size = tyConFamilySize (dataConTyCon con)
| > | +    tycon = dataConTyCon con
| > |
| > |  tagForArity :: DynFlags -> RepArity -> DynTag  tagForArity dflags
| > | arity diff --git a/compiler/types/TyCon.hs b/compiler/types/TyCon.hs
| > | index
| > | 39d2e9b..103c824 100644
| > | --- a/compiler/types/TyCon.hs
| > | +++ b/compiler/types/TyCon.hs
| > | @@ -78,7 +78,7 @@ module TyCon(
| > |          tyConDataCons, tyConDataCons_maybe,
| > |          tyConSingleDataCon_maybe, tyConSingleDataCon,
| > |          tyConSingleAlgDataCon_maybe,
| > | -        tyConFamilySize,
| > | +        tyConFamilySize, tyConFamilySizeAtMost,
| > |          tyConStupidTheta,
| > |          tyConArity,
| > |          tyConRoles,
| > | @@ -2205,6 +2205,20 @@ tyConFamilySize tc@(AlgTyCon { algTcRhs = rhs
| })
| > |        _                              -> pprPanic "tyConFamilySize 1"
| > | (ppr tc)
| > |  tyConFamilySize tc = pprPanic "tyConFamilySize 2" (ppr tc)
| > |
| > | +-- | Determine if number of value constructors a 'TyCon' has is
| > | +smaller
| > | +-- than n. Faster than tyConFamilySize tc <= n.
| > | +-- Panics if the 'TyCon' is not algebraic or a tuple
| > | +tyConFamilySizeAtMost  :: TyCon -> Int -> Bool
| > | +tyConFamilySizeAtMost tc@(AlgTyCon { algTcRhs = rhs }) n
| > | +  = case rhs of
| > | +      DataTyCon { data_cons = cons } -> lengthAtMost cons n
| > | +      NewTyCon {}                    -> 1 <= n
| > | +      TupleTyCon {}                  -> 1 <= n
| > | +      SumTyCon { data_cons = cons }  -> lengthAtMost cons n
| > | +      _                              -> pprPanic
| "tyConFamilySizeAtMost
| > | 1"
| > | +                                          (ppr tc)
| > | +tyConFamilySizeAtMost tc _ = pprPanic "tyConFamilySizeAtMost 2"
| > | +(ppr
| > | +tc)
| > | +
| > |  -- | Extract an 'AlgTyConRhs' with information about data
| > | constructors from an
| > |  -- algebraic or tuple 'TyCon'. Panics for any other sort of 'TyCon'
| > |  algTyConRhs :: TyCon -> AlgTyConRhs
| > |
| > | _______________________________________________
| > | ghc-commits mailing list
| > | ghc-commits at haskell.org
| > | https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmail
| > | .hask
| > | ell.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fghc-
| > | commits&data=02%7C01%7Csimonpj%40microsoft.com%7C677da6a867674476306
| > | 508d5
| > | 1d869247%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63644738617631
| > | 3720&
| > | sdata=NfTqei3XryWnCiNRgOZDT6XfMRMfAST7EKFBSi%2FCXa8%3D&reserved=0


More information about the ghc-devs mailing list