[Haskell-cafe] Efficient string output

Ketil Malde ketil at malde.org
Mon Feb 9 06:49:05 EST 2009


I'm currently working on a program that parses a large binary file and
produces various textual outputs extracted from it.  Simple enough.

But since we're talking about large amounts of data, I'd like to have
reasonable performance.

Reading the binary file is very efficient thanks to Data.Binary.
However, output is a different matter.  Currently, my code looks
something like:

      summarize :: Foo -> ByteString
      summarize f = let f1 = accessor f
                        f2 = expression f
                    in B.concat [f1,pack "\t",pack (show f2),...]

which isn't particularly elegant, and builds a temporary ByteString
that usually only gets passed to B.putStrLn.  I could live with the
inelegance if only it were fast, but this ends up taking the better part
of the execution time.

I tried to use lazy ByteStrings, the theory being that the components
that already are (strict) ByteStrings could be recycled as chunks.  I
also tried to push the output down into the function
(summarize :: Foo -> IO ()), but both of these were actually slower.
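
For concreteness, here is a minimal sketch of the three variants described
above.  The Foo record and its fields (fooName, fooCount) are hypothetical
stand-ins, since the real type isn't shown here; only the shape of the
output code is meant to match.  It makes no claim about which variant is
faster.

```haskell
module Main where

import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical stand-in for the real Foo type (an assumption).
data Foo = Foo { fooName :: B.ByteString, fooCount :: Int }

-- Strict variant: B.concat copies every piece into one new buffer.
summarizeStrict :: Foo -> B.ByteString
summarizeStrict f =
  B.concat [fooName f, B.pack "\t", B.pack (show (fooCount f))]

-- Lazy variant: the existing strict ByteStrings are reused as chunks
-- of the lazy ByteString instead of being copied.
summarizeLazy :: Foo -> L.ByteString
summarizeLazy f =
  L.fromChunks [fooName f, B.pack "\t", B.pack (show (fooCount f))]

-- IO variant: write each field directly, no intermediate string.
summarizeIO :: Foo -> IO ()
summarizeIO f = do
  B.putStr (fooName f)
  B.putStr (B.pack "\t")
  B.putStrLn (B.pack (show (fooCount f)))

main :: IO ()
main = do
  let foo = Foo (B.pack "gene1") 42
  B.putStrLn (summarizeStrict foo)
  L.putStrLn (summarizeLazy foo)
  summarizeIO foo
```

All three write the same tab-separated line; they differ only in whether
and where an intermediate string is built.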

Since I surely can't be the first person that needs to output
tab-separated text, I'd be grateful if somebody could point me in the
right direction. 

If I haven't seen further, it is by standing in the footprints of giants
