evalList, seqList

Thu Dec 1 10:50:24 CET 2011

Dear libraries,

I am experimenting with the attached code, where right before the
consumption by Vector.fromList, the list is unevaluated, and each
element has a small type (a Double), but if left unevaluated before the
list is traversed further, will hold on to a rather large portion of
memory. I want to use the example in a talk about efficient Haskell next
week.

After consumption by (unboxed) Vector.fromList, the list has been
traversed, but every element in the vector is still a huge thunk, thus
holding on to 429MB maximum residency. If I just deepseq the hole list
before converting it to a vector, the memory residency becomes 140MB,
about what can be expected for a unboxed Vector Double with 3664891
entries plus the [Double] with 3664891 entries.

The point of the exercise is to show that we get the best performance by
evaluating the list value-strict, i.e. evaluate the head of the list
before continuing, but still allowing the GC to remove the part of the
list has has already been consumed by fromList, yielding a constant
space behavior (not counting the final vector). This is actually
motivated by a real-world experience of mine. So if I use this function:

        valueStrictList :: [a] -> [a]
        valueStrictList [] = []
        valueStrictList (x:xs) = x `seq` (x:valueStrictList xs)

before fromList, the maximum residency drops to 67MB, which is pretty
good. (58MB is the pure cost of the resulting vector).

Now I’d like to get the same result with the standard functions from
Control.Seq, instead of rolling my own. I did not look in
Control.Parallel.Strategies, as I am not interested in parallelization
at this point. So I tried

        Seq.withStrategy (Seq.seqList Seq.rseq)

but this has the same effect as deepseq (140MB), which is probably what
was intended, but not what I need. So I did look at
Control.Parallel.Strategies and found evalList, which sounds like what I
want, but

        Strat.withStrategy (Strat.evalList Strat.rseq)

caused a Stack space overflow, and even using the same function as
before, but now from Strategies, i.e.

        Strat.withStrategy (Strat.seqList Strat.rseq)

causes a stack overflow.

My questions are:
     1. Why is there no evalList in Control.Seq? Should it be there?
     2. Why does seqList behave differently in
        Control.Parallel.Strategies and in Control.Seq? Is that a bug?

The example code is attached, and the various options are easily tried
out by commenting out the apropriate line. I’m using ghc-7.0.4 and
parallel-3.1.0.1, and compiled the code with "-O2".

Greetings,
Joachim

-- 
Joachim "nomeata" Breitner
  mail at joachim-breitner.de  |  nomeata at debian.org  |  GPG: 0x4743206C
  xmpp: nomeata at joachim-breitner.de | http://www.joachim-breitner.de/

-------------- next part --------------
A non-text attachment was scrubbed...
Name: collatz.hs
Type: text/x-haskell
Size: 1790 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/libraries/attachments/20111201/8cba2926/attachment-0001.hs>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part
URL: <http://www.haskell.org/pipermail/libraries/attachments/20111201/8cba2926/attachment-0001.pgp>