[Haskell-cafe] Data.ByteStream.Char8.words performance

Duncan Coutts duncan.coutts at worc.ox.ac.uk
Fri Mar 30 22:09:37 EDT 2007


On Fri, 2007-03-30 at 14:24 -0700, Jeremy Shaw wrote:
> Hello,
> 
> Did you compile with -O2 ? That makes a huge difference when using ByteString.

Hmm, I think we can do better than that. It would be nicer to have it
work fast without needing any -O flags at all in the user's module.

Let's look at the current definition again:

words :: ByteString -> [ByteString]
words = P.filter (not . B.null) . B.splitWith isSpaceWord8
{-# INLINE words #-}

So this will always inline words into your program (when using -O or
-O2). However, there is nothing really to be gained from doing that.
There's no fusion going on here; it's always going to (lazily) allocate
the result list.
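
To make the behaviour of that definition concrete, here is a minimal
sketch using only the public Data.ByteString.Char8 API. The names
wordsViaSplit, rawPieces and cleaned are made up for illustration, and
Data.Char.isSpace stands in for the library's internal isSpaceWord8, so
this is an approximation of the library code rather than a copy of it.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (isSpace)

-- Hypothetical stand-in for the library's words, built from public functions.
wordsViaSplit :: B.ByteString -> [B.ByteString]
wordsViaSplit = filter (not . B.null) . B.splitWith isSpace

-- splitWith keeps an empty piece between two adjacent separators...
rawPieces :: [B.ByteString]
rawPieces = B.splitWith isSpace (B.pack "foo  bar baz")

-- ...which is why the definition needs the filter afterwards.
cleaned :: [B.ByteString]
cleaned = wordsViaSplit (B.pack "foo  bar baz")
```

Either way the result is an ordinary lazy list of ByteStrings, built one
cons cell at a time, which is the allocation that inlining does nothing
to remove.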

So I think it's probably better to just remove the inline pragma. In
fact Dino's original program might work faster with -O0 than -O1. :-)

The best you could do with the current definition (rather than writing a
specialised implementation) is something like:

words = P.filter (not . B.null) . words'
{-# INLINE words #-}

words' = B.splitWith isSpaceWord8
{-# NOINLINE words' #-}

since the filter could fuse in the calling context with a good list
consumer, but B.splitWith is not a good producer in its current
definition, so there is no benefit to inlining it. All that inlining it
gives you is the potential to compile it badly in the calling module
rather than just calling the single compiled version in the ByteString
lib (that was of course built with -O2).
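
As a self-contained sketch of the wrapper/worker split described above,
the following uses the public API only. The names wordsBS, splitOnSpace
and totalWordLength are hypothetical, Data.Char.isSpace again stands in
for isSpaceWord8, and whether the filter really fuses with the consumer
depends on the GHC version and optimisation flags in the calling module.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (isSpace)

-- Thin wrapper: marked INLINE so the filter is visible to the caller's
-- optimiser and may fuse with a list consumer there.
wordsBS :: B.ByteString -> [B.ByteString]
wordsBS = filter (not . B.null) . splitOnSpace
{-# INLINE wordsBS #-}

-- Worker: marked NOINLINE so every caller shares the one compiled copy
-- instead of recompiling the splitting loop in its own module.
splitOnSpace :: B.ByteString -> [B.ByteString]
splitOnSpace = B.splitWith isSpace
{-# NOINLINE splitOnSpace #-}

-- A possible call site: foldr is a list consumer that the filter in
-- wordsBS can potentially fuse with once the wrapper has been inlined.
totalWordLength :: B.ByteString -> Int
totalWordLength = foldr (\w n -> B.length w + n) 0 . wordsBS
```

In a real library the wrapper and worker would live in the library
module and only the call site in the user's module, so the pragmas
would then control what crosses the module boundary.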

The ByteString library was more or less the first high-performance thing
that we wrote, and we've learnt plenty more since then. I think there's a
good deal more performance to eke out of it yet, both at the low and
high level.

Duncan


