Generalising the demand analysis to sum types

Wed Feb 17 04:50:25 UTC 2016

Hello ghc-devs,

Apologies for the longish email.

I'm looking to generalise GHC's demand analysis to work on sum types.
This is in connection with the work on unboxed sums. The idea is that
if the strictness analysis tells us that a sum type is strict in all
the fields of all of its constructors, it is safe to unbox that sum
automatically. We hope for this to feed into a worker/wrapper like
transformation for sum types.

I am new to the GHC codebase and wanted to make sure that my plan of
attack seemed sensible to you all.

GHC's current type representing demand is StrDmd, which is defined as follows:

data StrDmd
    = HyperStrict
    | SCall StrDmd
    | SProd [ArgStr]
    | HeadStr

I believe that adding sum types will be as simple as adding a single
new constructor for Sums:

data StrDmd
    = HyperStrict
    | SCall StrDmd
    | SProd [ArgStr]
    | SSum [(Tag, StrDmd)]
    | HeadStr

We need the constructor tag so that we can analyze each branch of a
case expression independently before combining the results of each
branch. Since we only care if all fields are strict for all
constructors, we won't actually use the tag information except in the
analysis itself.

Projection-based strictness analysis on sum types is not new (though
sum types + higher-order seems to be). That being said, all previous
treatments of the subject that I'm aware of forbid polymorphic
recursion. Luckily for us we aren't trying to unbox recursive types,
so for now [1] we will not attempt to analyze recursive types, hence
no SMu or SRec constructor above.

With the analysis accepting sum types we will be able to analyze
functions with sum types as parameters, as a trivial example:

fromMaybe2 x y = case x of
        Just v -> v
        Nothing -> y

You would get a demand signature like:

Str=Dmdtype <S[("Just", <S,U>), ("Nothing", <>)],U><L,U>

Which says that the function is strict in its first argument and that
if the value is a 'Just' then its field is used strictly, if the value
is a 'Nothing' then there is no further demand (a nullary product).

For those with more experience in these matter, does this seem to be a
sensible approach?

Thanks in advance for any insight,

Jose

[1]: Those of you who saw my Haskell Symposium talk on implicit
parallelism last year might remember that we will need the analysis to
work on recursive types in order to automatically generate parallel
strategies, but that's for later (I've spoken with Ilya Sergey about
working on that).