[Haskell-cafe] [RFC] benchmarks of bytestrings, teaser

Peter Lund firefly at vax64.dk
Sat Dec 15 03:25:05 EST 2007


What do you think the relative speeds are of the six small haskell
programs at the end of this email?

All they do is read from stdin and count the number of spaces they see.
There are two that use strict bytestrings, two that use lazy
bytestrings, and two that use the standard Haskell strings.  Three use a
recursive function with an accumulator parameter and three use a foldl
with a lambda function.

Say the fastest one takes the time 1.  How much time will the others
take?

And how about memory?  How much memory do you think they require?  Let's
say we feed a 150MB(*) file into each of them, how many megabytes do you
think they end up using (as seen from the OS, not in terms of how big
the live heap is)?

I'm going to post full benchmarks + analysis on Wednesday.

-Peter

*) hardddisk megabytes.  The file is 150000034 bytes ≈ 143 mebibytes.


PS: For extra credit, what do you think is the peak memory use for this
    program when given an input file of 150MB?


{-# LANGUAGE BangPatterns #-}

import qualified Data.ByteString.Lazy.Char8 as B
import GHC.Int (Int64)

-- note that D.BS.Lazy.Char8.length is ByteString -> Int64
--           D.BS.C8.length is ByteString -> Int
cnt	:: B.ByteString -> Int64
cnt bs	= B.length (B.filter (== ' ') bs)

main = do s <- B.getContents
	  print (cnt s)




==============================
hs/space-bs-c8-acc-1.hs:
{-# LANGUAGE BangPatterns #-}

import qualified Data.ByteString.Char8 as B

cnt	:: Int -> B.ByteString -> Int
cnt !acc bs = if B.null bs
		then acc
	        else cnt (if B.head bs == ' ' then acc+1 else acc) (B.tail bs)

main = do s <- B.getContents
	  print (cnt 0 s)
==============================
hs/space-bslc8-acc-1.hs:
{-# LANGUAGE BangPatterns #-}

import qualified Data.ByteString.Lazy.Char8 as B

cnt	:: Int -> B.ByteString -> Int
cnt !acc bs = if B.null bs
		then acc
	        else cnt (if B.head bs == ' ' then acc+1 else acc) (B.tail bs)

main = do s <- B.getContents
	  print (cnt 0 s)
==============================
hs/space-xxxxx-acc-1.hs:
{-# LANGUAGE BangPatterns #-}

cnt	:: Int -> String -> Int
cnt !acc bs = if null bs
		then acc
	        else cnt (if head bs == ' ' then acc+1 else acc) (tail bs)

main = do s <- getContents
	  print (cnt 0 s)
==============================
hs/space-bs-c8-foldlx-1.hs:
{-# LANGUAGE BangPatterns #-}

import qualified Data.ByteString.Char8 as B

cnt	:: B.ByteString -> Int
cnt bs	= B.foldl' (\sum c -> if c == ' ' then sum+1 else sum) 0 bs

main = do s <- B.getContents
	  print (cnt s)
==============================
hs/space-bslc8-foldlx-1.hs:
{-# LANGUAGE BangPatterns #-}

import qualified Data.ByteString.Lazy.Char8 as B

cnt	:: B.ByteString -> Int
cnt bs	= B.foldl' (\sum c -> if c == ' ' then sum+1 else sum) 0 bs

main = do s <- B.getContents
	  print (cnt s)
==============================
hs/space-xxxxx-foldl.hs:
{-# LANGUAGE BangPatterns #-}

cnt	:: String -> Int
cnt bs	= foldl (\sum c -> if c == ' ' then sum+1 else sum) 0 bs

main = do s <- getContents
	  print (cnt s)
==============================




More information about the Haskell-Cafe mailing list