Computing the final representation type of a TyCon (Was: Unpack primitive types by default in data)
Johan Tibell
johan.tibell at gmail.com
Thu Nov 29 09:27:39 CET 2012
Hi all,
I've decided to try to implement the proposal included at the end of
this message. To do so I need to write a function
    hasPointerSizedRepr :: TyCon -> Bool
This function would check that the TyCon is either
 * a newtype whose representation type has a pointer-sized representation, or
 * an algebraic data type with a single constructor whose one field has
   a pointer-sized representation.
I'm kinda lost in all the data types that GHC defines to represent
types. I've gotten no further than
    hasPointerSizedRepr :: TyCon -> Bool
    hasPointerSizedRepr tc@(AlgTyCon {}) = case algTcRhs tc of
        -- a data type with exactly one data constructor
        DataTyCon { data_cons = [data_con] }
            -> ...
        -- a newtype always has exactly one data constructor
        NewTyCon { data_con = data_con }
            -> ...
        _ -> False
    hasPointerSizedRepr _ = False
I could use some pointers (no pun intended!) at this point. The
function ought to return True for all the following types:
    data A = A Int#
    newtype B = B A
    data C = C !B
    data D = D !C
    data E = E !()
    data F = F !D
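
To make the intended behaviour concrete, here is a toy sketch of the check
over a simplified model of types. Everything in it is hypothetical: Ty,
pointerSizedModel and the tyA..tyF values are illustration names, not GHC's
TyCon, Type or any GHC API, and treating () as a field that is always kept
as a pointer is an assumption of the model.

```haskell
-- Toy model only: Ty is a hypothetical stand-in, not GHC's TyCon or Type.
data Ty
  = Prim          -- an unboxed, word-sized primitive such as Int#
  | Ptr           -- assumption: a boxed value always kept as a pointer, e.g. ()
  | Newtype Ty    -- a newtype wrapping another type
  | Data [[Ty]]   -- an algebraic type: constructors, each a list of strict fields

-- True when the modelled type fits in a single machine word.
pointerSizedModel :: Ty -> Bool
pointerSizedModel Prim         = True
pointerSizedModel Ptr          = True
pointerSizedModel (Newtype t)  = pointerSizedModel t
pointerSizedModel (Data [[f]]) = pointerSizedModel f  -- one constructor, one field
pointerSizedModel (Data _)     = False                -- anything else

-- The six example types above, encoded in the model.
tyA, tyB, tyC, tyD, tyE, tyF :: Ty
tyA = Data [[Prim]]   -- data A = A Int#
tyB = Newtype tyA     -- newtype B = B A
tyC = Data [[tyB]]    -- data C = C !B
tyD = Data [[tyC]]    -- data D = D !C
tyE = Data [[Ptr]]    -- data E = E !()
tyF = Data [[tyD]]    -- data F = F !D
```

In this model, all pointerSizedModel [tyA, tyB, tyC, tyD, tyE, tyF] evaluates
to True, while a two-field type such as Data [[Ptr, Ptr]] gives False.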
One part that confuses me is figuring out the representation type of a
data constructor after unpacking. For example, the function should not
return true if called on G in this example:
    data G = G !H
    data H = H {-# UNPACK #-} !I
    data I = I !Int !Int
because if we unpacked H into G's constructor it would take up two
words, due to I being unpacked.
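
The word counting behind this example can be sketched with another toy
model. Field, Con, repWords and fitsInOneWord are hypothetical names for
illustration only, not GHC's DataCon or its API, and the model assumes that
the two Int fields of I are themselves unpacked (if they stayed as pointers,
I would still take two words).

```haskell
-- Toy model only: Field and Con are hypothetical, not GHC's DataCon.
data Field
  = Word1          -- one word stored inline, e.g. Int# or a pointer
  | Unpacked Con   -- a strict field whose constructor is unpacked in place

-- A single-constructor type, described by its list of fields.
newtype Con = Con [Field]

-- Number of words the fields occupy once unpacked fields are flattened.
repWords :: Con -> Int
repWords (Con fields) = sum (map fieldWords fields)
  where
    fieldWords Word1        = 1
    fieldWords (Unpacked c) = repWords c

conInt, conI, conH :: Con
conInt = Con [Word1]                             -- data Int = I# Int#
conI   = Con [Unpacked conInt, Unpacked conInt]  -- assumption: both Int fields unpacked
conH   = Con [Unpacked conI]                     -- data H = H {-# UNPACK #-} !I

-- Would unpacking this constructor into a parent's field keep it at one word?
fitsInOneWord :: Con -> Bool
fitsInOneWord c = repWords c == 1
```

Here repWords conH is 2, so fitsInOneWord conH is False: unpacking H into
G's constructor would need two words, whereas fitsInOneWord conInt is True.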
Does DataCon contain the unpacked representation of the data
constructor or only the before-optimizations representation?
Cheers,
Johan
On Thu, Feb 16, 2012 at 4:25 PM, Johan Tibell <johan.tibell at gmail.com> wrote:
> Hi all,
>
> I've been thinking about this some more and I think we should
> definitely unpack primitive types (e.g. Int, Word, Float, Double,
> Char) by default.
>
> The worry is that reboxing will cost us, but I realized today that at
> least one other language, Java, does this already today and even
> though it hurts performance in some cases, it seems to be a win on
> average. In Java all primitive fields get auto-boxed/unboxed when
> stored in polymorphic fields (e.g. in a HashMap which stores keys and
> fields as Object pointers.) This seems analogous to our case, except
> we might also unbox when calling lazy functions.
>
> Here's an idea of how to test this hypothesis:
>
> 1. Get a bunch of benchmarks.
> 2. Change GHC to make UNPACK a no-op for primitive types (as library
> authors have already worked around the lack of unpacking by using this
> pragma.)
> 3. Run the benchmarks.
> 4. Change GHC to always unpack primitive types (regardless of the
> presence of an UNPACK pragma.)
> 5. Run the benchmarks.
> 6. Compare the results.
>
> Number (1) might be what's keeping us back right now, as we feel that
> we don't have a good benchmark set. I suggest we try with nofib first
> and see if there's a difference and then move on to e.g. the shootout
> benchmarks.
>
> I imagine that ignoring UNPACK pragmas selectively wouldn't be too
> hard. Where is the relevant code?
>
> Cheers,
> Johan