Strict foldl

Julian Seward (Intl Vendor) v-julsew@microsoft.com
Fri, 7 Dec 2001 02:03:13 -0800


| > Prelude> sum [1..10000000]
| >=20
| > had the following effect:
| >=20
| >   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM  TIME COMMAND
| > 23542 herrmann  20   0  250M 130M 97500 R    66.3 52.4   0:21
ghc-5.02
|=20
| Is this what I think it is?  Do you benchmark the
| interpreter?  Interpreted code isn't optimised.  When I
| compile
|=20
|   main =3D print $ sum [1..10000000]
|=20
| with -O2, it takes 13s on a 600MHz P3 and runs in 1.5MB of
| space.

I should point out that the interpreter's (byte-)code generator
is simplistic, and does not emit code to stub (make NULL) stack slots=20
which are known to be dead.  This is probably causing the=20
space behaviour you are seeing.  In contrast the compiled-code
back ends do do stack stubbing.  If anyone wants to rewrite the
byte code generator to keep track of stack slot liveness and
emit suitable stub insns, that would be a service to us all :)

[constant folding]

| The natice code generator probably also performs some of
| those optimisations, but I am not sure exactly which.

Yes, it tries quite hard to fold literal trees and usually=20
succeeds well enough that the generated code is pretty much
identical to that emitted by gcc.

J