[Haskell-cafe] [RFC] benchmarks of bytestrings, teaser

Don Stewart dons at galois.com
Sun Dec 16 00:18:55 EST 2007


firefly:
> On Sun, 2007-12-16 at 04:53 +0100, Daniel Fischer wrote:
> 
> > > Hmm. Lazy accumulator eh, on String?  Should exhibit a space leak.
> > 
> > Doesn't (with -O2, at least), seems ghc's strictness analyser did a good job.
> > It is indeed about 10* slower than ByteStrings, but very memory friendly - 
> 
> Daniel is right, there's no space leak.
> 
> Try it.  You'll get a nice surprise :)

Very nice. If we disable the strictness analsyer,

    $ ghc -O2 -fno-strictness A.hs -no-recomp -o A

We get a core loop that looks like:
        
    lgo :: Int -> [Char] -> Int
    lgo n xs = case xs of
          []     -> n
          a : as -> lgo (case a of                              -- sad dons
                             C# c1_amH -> case c1_amH of {
                                 ' '       -> case n of I# i -> I# (i +# 1)
                                 _    -> n;
                         ) as


Look at that big lazy expression for 'n' not being forced!
And when run:

        1015M 1017M onproc/1 -         0:05 22.36% A

Scary stuff.  Lots of Int thunks :)

 But enabling the strictness analyser:

    lgo :: Int# -> [Char] -> Int#
    lgo (n :: Int#) (xs :: [Char]) =
        case xs of
          []     -> n
          a : as -> case a of 
                C# c -> case c of
                      ' ' -> lgo (n +# 1) as -- makes me happy
                      _   -> lgo n        as
            
And life is good again :)

What is quite amazing is how efficient this program is.
I had to rerun this a dozen or so times, since I didn't quite believe it:

    $ time ./A < /usr/obj/data/150M 
    24569024
    ./A < /usr/obj/data/150M  2.42s user 0.47s system 100% cpu 2.883 total

Pretty stunning, I think.
Swapping in a slightly more eager structure, the lazy ByteString,

    $ time ./A < /usr/obj/data/150M
    24569024
    ./A < /usr/obj/data/150M  0.86s user 0.07s system 98% cpu 0.942 total

improves things by a good amount, but I think we can revisit the low level
performance of lazy bytestrings again, in light of all the changes to the
optimiser in the past 2 years.

-- Don


More information about the Haskell-Cafe mailing list