Bagley shootout. Was: Lightningspeed haskell

Jan Kort kort@wins.uva.nl
Fri, 02 Mar 2001 11:26:06 +0100


Simon Peyton-Jones wrote:
> 
> A String is a [Char] and a Char is a heap object. So
> a file represented as a string takes a massive 20 bytes/char
> (12 for the cons cell, 8 for the Char cell).  Then it's all sucked
> through several functions.
> 
> It's entirely possible, though, that the biggest performance hit
> is in the I/O itself.  We'd be happy if anyone wanted to investigate
> and improve.

Unless ghc is extremely fast at filling a heap, the bottleneck is
memory allocation. I get 11.8 seconds for ghc with the standard heap
and 7.3 seconds when I give it enough heap to avoid garbage
collection altogether. That heap is 200M, so I don't think there is
much time left to do anything else.

The input is 2000000 bytes, so that is 40M worth of [Char] data.
I guess lines and unlines each build a new list, which adds
12 bytes * 2 passes * 2M chars = 48M of cons cells. That makes 88M,
so about 100M in total. I guess the other 100M is used for function
applications.
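
The estimate above can be reproduced with a few lines of Haskell.
This is only a back-of-envelope sketch: the per-cell sizes are the
ones quoted from Simon's message, the names are made up for
illustration, and it assumes that lines and unlines each allocate
one fresh cons cell per input character.

```haskell
-- Back-of-envelope heap estimate for the figures in this message.
-- All names are illustrative; nothing here measures a real program.

inputBytes, consBytes, charBytes :: Int
inputBytes = 2000000   -- size of the input file in bytes
consBytes  = 12        -- one cons cell, as quoted above
charBytes  = 8         -- one Char cell, as quoted above

-- The file as a String: one cons cell plus one Char cell per character.
stringBytes :: Int
stringBytes = inputBytes * (consBytes + charBytes)

-- Assumption: lines and unlines each allocate one new cons cell
-- per character (two passes over the data).
relistBytes :: Int
relistBytes = consBytes * 2 * inputBytes

-- List data accounted for by the estimate.
dataBytes :: Int
dataBytes = stringBytes + relistBytes

main :: IO ()
main = do
  putStrLn ("String data:       " ++ show stringBytes ++ " bytes")
  putStrLn ("lines and unlines: " ++ show relistBytes ++ " bytes")
  putStrLn ("total list data:   " ++ show dataBytes ++ " bytes")
```

The total comes to 88000000 bytes, which is the "about 100M" figure
above. The estimate does not account for the remainder of the 200M
heap.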

  Jan