[Haskell-cafe] Is Haskell capable of matching C in string processing performance?

Gregory Crosswhite gcross at phys.washington.edu
Fri Jan 22 18:09:21 EST 2010


I would say that counts as cheating because it assumes that knowledge of the input in advance.  However, I wonder how it would perform if there were a "reChunk" function that lazily built a new lazy ByteString by merging smaller chunks together --- i.e., it would keep pullings chunks from the ByteString until it reached some threshold size, merge them into a single strict ByteString chunk, and then recursively continue processing the rest of the lazy ByteString in this manner.

Cheers,
Greg


On Jan 22, 2010, at 7:30 AM, Tom Nielsen wrote:

>> It seems to me this indicates that the big expense here is the call into the I/O system.
> 
> So let's make fewer I/O calls:
> 
> import Control.Monad
> import qualified Data.ByteString.Char8 as S
> import System.IO
> 
> null_str1 = S.concat $ take 1000 $ repeat $ S.pack "null"
> 
> n1 = 5000000 `div` 1000
> 
> main = withBinaryFile "out3.json" WriteMode $ \h -> do
> hPutStr h "["
> replicateM_ n1 (S.hPutStr h null_str1)
> hPutStr h "]"
> ---
> this is 10x faster. Whether this is cheating or not depends on what
> John actually wants to do.
> 
> Tom



More information about the Haskell-Cafe mailing list