[Haskell-cafe] OCaml list sees abysmal Language Shootout results

Sam Mason mason at f2s.com
Wed Sep 29 20:02:54 EDT 2004


Greg Buchholz wrote:
>The algorithm isn't correct (it counts spaces instead of words), but
>anyone have advice for improving its performance?

You probably want some strictness annotations in there. . .  When I
tried the same thing, I came up with something like:

> import Char;
>
> cclass c | isSpace c = (c == '\n', False)
>          | otherwise = (False    , True)
>
> data Cdata = Cdata !Bool !Int !Int !Int
>   deriving Show
>
> combine (Cdata last l w c) (nl, iw) = Cdata iw l' w' (c + 1)
>     where l' = if nl             then l + 1 else l
>           w' = if not last && iw then w + 1 else w
>
> wc = foldl combine (Cdata False 0 0 0) . map cclass
>
> main = getContents >>= putStrLn . show . wc

It seems to work in constant stack space, and gives the same answers
(albeit not very beautifully) as my GNU copy of "wc".

>Is the problem
>caused by the laziness of the Int's inside the tuple?

I'm pretty sure that's what's causing it.  I had quite a search around
when my version was running out of memory and everything seemed to
suggest that, basically, the algorithm is building a massive list of
"+1"s that only actually get executed when the you try and print the
totals at the end.

Any comments from more authoritative sources?

>Here is the
>report from ghc with the '-ddump-simpl' option.

If anyone has any hints about how to read this output, I'd be
grateful.  It makes a bit of sense, but I don't really know what it
"means".  I.e. it's obviously the simplified parse tree and I can see
how it relates back to the source (loosely), but attempting to figure
out if something's going to be as leaky as a sieve or not is beyond
me.


More information about the Haskell-Cafe mailing list