# Can cardinality analysis subsume strictness analysis

Joachim Breitner mail at joachim-breitner.de
Mon Dec 2 15:36:51 UTC 2013

Hi,

Am Freitag, den 29.11.2013, 11:54 +0000 schrieb Joachim Breitner:
> after playing a bit with the Demand analyser and trying debug tricky
> bugs (#8569), I begin to wonder: Can cardinality analysis subsume
> strictness analysis?

this is a more structured attempt to answer that question.

The motivation for this proposal is that I find it easier to think about
what the lubDemand, bothDemand etc. should be if I mostly have to think
about how often something is called (type Count below), instead of
thinking about Demand&Cardinality and Strictness (which feel very
similar to me) separately.

Currently we have these data structures (Demand.lhs, slightly simplified
by removing field names and type aliases):

data DmdType = DmdType (VarEnv JointDmd) [Demand] DmdResult
data CPRResult = NoCPR | RetProd | RetSum ConTag | BotCPR

data Demand = JD MaybeStr MaybeUsed
data MaybeStr = Lazy | Str StrDmd
data StrDmd = HyperStr | SCall | SProd [MaybeStr] | HeadStr

data MaybeUsed = Abs | Use Count UseDmd
data UseDmd = UCall Count UseDmd | UProd [MaybeUsed] | UHead | Used
data Count = One | Many

I suggest the following date type to replace the latter six above:

data CDemand = BotDemand | Use Count Demand -- C for cardinality
data Demand =  Call Count Demand | Prod [CDemand] | PolyUse | TopDemand
data Count = NoIdea | Zero | AtMostOne | AtLeastOne | ExactlyOne
-- interesting intervals of {0, 1, ∞}, corresponding to
{0,1,∞} {0}     {0,1}       {1,∞}        {1}

Invariants and equations:
* "Use Zero x" implies x = TopDemand. Therefore in the following:
useZero = useZero TopDemand
* TopDemand = Prod [Use NoIdea TopDemand,...] and
TopDemand = Call (Use NoIdea TopDemand), but as ever we can use the
latter form for piggy-backed boxity analysis.
* BotDemand is the demand put on something that diverges for sure. So
it is not only the “bottom of the demand lattice” but also “the
demand of something bottoming” – either way of thinking works.

And here is a transformation from old to new:

combine1 :: MaybeStr -> MaybeUsed      -> CDemand
Lazy Abs                              -> useZero
Lazy (Use c d)                        -> Use c' (convertLazy d)
where c' = case c of One ->  AtMostOne
Many -> NoIdea
(Str HyperStr) Abs                    -> BotDemand
(Str HyperStr) _                      -> – not representable – (Note 1)
(Str _) Abs                           -> – not representable – (Note 2)
(Str s) (Use c d)                     -> Use c' (combine2 s d)
where c' = case c of One ->  ExactlyOne
Many -> AtLeastOne

convertLazy :: UseDMD                  -> Demand
UCall c d                             -> Call c' (convertLazy d)
where c' = case c of One -> AtMostOne -- see Note 4
Many -> NoIdea
UProd ds                              -> UProd (map convertLazy ds)
UHead                                 -> PolyUse useZero
Used                                  -> TopDemand

combine2 :: StrDmd -> UseDMD           -> Demand
SCall      (UCall Many d)             -> Call NoIdea (convertLazy d)
SCall      (UCall One d)              -> Call (AtMostOne (convertLazy d))
SCall      (UProd _)                  -> – not representable – (Note 3)
SCall      (UHead _)                  -> PolyUse
SCall      Used                       -> Call TopDemand
(SProd ss) (UCall _ _)                -> – not representable – (Note 3)
(SProd ss) (UProd ds)                 -> Prod (zipwith combine1 ss ds)
(SProd ss) (UHead _)                  -> Prod (zipwith combine1 ss (repeat Abs))
(SProd ss) Used                       -> Prod (zipwith combine1 ss (repeat Used))
HeadStr    (UCall Many d)             -> Call NoIdea (convertLazy d)
HeadStr    (UCall One d)              -> Call (AtMostOne (convertLazy d)
HeadStr    (UProd ds)                 -> Prod (zipwith combine1 (repeat Lazy) ds)
HeadStr    Used                       -> PolyUse
HyperStr   _                          -> – dead code – (handled by combine1)

Note 1: About HyperStr
As far as I can see, this strictness demand is the one caused by
diverging calls (error or non-termination), and is only relevant
when it comes with a Use demand of Abs
Note 2: Strict but absent
Besides diverging things (who currently put a hyper-strict &
absent demand on free variables), nothing can be strict and
absent, right?
Note 3: Disagreeing signatures
When the strictness side sees a Call, the the demand side better
not see a Product, and vica versa
Note 4: The Count in Call
is currently either AtMostOne or NoIdea, i.e. there is notion of
“function is not called at all”. Does that correctly reflect
what we have right now?

And here is the lattice. The nice thing is that there is a lattice for
Counts that is mostly independent from the other lattice, at least in
terms of thinking about it.

NoIdea
/  \
/    \
AtMostOnce   AtLeastOnce
/   \        /
Zero  ExactlyOne

TopDemand
/      \
Use c Call  Use c Prod
\      /
Use c PolyUse
|
BotDemand

Some thoughts:

The lattices (both existing and proposed) are mostly complicated by two
special cases:
* bottoming stuff with hyperstrict demand and
* polymorphic seq.

As mentioned above, (HyperStr, Abs), i.e. the demand of error, becomes
BotDemand. We have no way of representing (HyperStr, Used), for example.
Would that occur?

For seq, I propose to use PolyUse (for “unknown polymorphic use”). This
goes well the current use of (HeadStr, UHead) for seq. The strange is,
though, that HeadStr lives _above_ SCall und SProd, while UHead lives
_below_ UCall and UProd. This means that currently,
case b of True -> x `seq` ()
False -> case x of (a,b) -> ()
has a joint demand of d=(HeadStrict, UProd [Abs, Abs]) on x – why not
d'=(SProd [Lazy, Lazy], UProd [Abs, Abs])? Anyways, in my lattice both
of these demands correspond to (Prod [absZero, absZero]), i.e. my
proposal hides the current differences. Are these wanted differences or
unwanted ones?

Some advantages of the code are:
* Less duplication between StrDemand and UseDemand.
* Non-redundant information about what is a function and what is a
product.
* Functions like cleanDemand simply take out Demand from CDemand, and
the Count is the information that one needs to pass to deferAndUse,
so less logic and less cases there.

So my proposal is not an isomorphism of the existing lattice (as
expected), and can neither express all existing combinations, nor
differentiate between all of them. Which is a good thing, iff those
values and differences are not desired.

Surely more interesting corner cases will turn up when and if one tries
to employ this scheme. Also, of course, there is still improvement for
further polishing. The question is: Is this worth pursuing, or is there
a reason why this cannot work, or maybe it would but it would not be an
improvement?

Thanks,
Joachim

--
Joachim “nomeata” Breitner
mail at joachim-breitner.dehttp://www.joachim-breitner.de/
Jabber: nomeata at joachim-breitner.de  • GPG-Key: 0x4743206C
Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part