Proposal: add laziness to Data.Map / IntMap

Bertram Felgenhauer bertram.felgenhauer at googlemail.com
Wed Aug 6 03:57:59 EDT 2008


Adrian Hey wrote:
> Using explicit seqs rather than strict data types is actually faster,
> for reasons that are a bit of a mystery to me. I'm not sure what cost
> Bertram is talking about, but AFAIK ghc uses the same info pointer
> mechanism for all heap records, including unevaluated thunks (although
> the info pointers will point to different things of course). But the
> cost of pattern matching on *evaluated* AVL nodes should be independent
> of strictness annotation AFAICS.

I must admit that for the time being the cost is of a theoretical
nature. But let me explain the idea.

Consider this code:

> module Nat (isOne) where
> 
> data Nat = Succ !Nat | Zero
> 
> isOne :: Nat -> Bool
> isOne n = case n of
>     Zero    -> False
>     Succ n' -> case n' of
>         Zero   -> True
>         Succ _ -> False

The code of isOne starts by forcing n (looking at n's tag and entering
the closure if it's unevaluated in ghc's case) and then a pattern
match (looking at the the tag again).

Now for the second pattern match, we can skip the first step, because
we know that n' is fully evaluated, thanks to the strictness annotation
in the Succ constructor.

However, ghc currently does generate the same (cmm) code for isOne
regardless of the strictness annotation, so performance wise you only
get to pay the price of the annotation (I expect that some thunks are
unnecessarily reevaluated when the constructor is used) without this
benefit.

Did I miss any reason why this idea can't work?

I really expected ghc to do that optimisation - obviously that was
wishful thinking on my part.

Bertram


More information about the Libraries mailing list