[Haskell-cafe] State monad strictness (was: ... abysmal Language
Shootout results)
Peter Simons
simons at cryp.to
Thu Sep 30 11:06:32 EDT 2004
How can anyone stay away from such a deliciously pointless
waste of time as implementing a wc(1) derivate? :-)
Here is my attempt:
> import IO
>
> type Count = Int
> data CountingState = ST !Bool !Count !Count !Count
> deriving (Show)
>
> initCST = ST True 0 0 0
>
> wc :: CountingState -> [Char] -> CountingState
> wc (ST _ l w c) ('\n':xs) = wc (ST True (l+1) w (c+1)) xs
> wc (ST _ l w c) (' ' :xs) = wc (ST True l w (c+1)) xs
> wc (ST _ l w c) ('\t':xs) = wc (ST True l w (c+1)) xs
> wc (ST True l w c) ( x :xs) = wc (ST False l (w+1) (c+1)) xs
> wc (ST False l w c) ( x :xs) = wc (ST False l w (c+1)) xs
> wc st [] = st
>
> main :: IO ()
> main = do
> ST _ l w c <- getContents >>= return . wc initCST
> putStrLn $ (l `shows`) . spaces . (w `shows`) . spaces . (c `shows`) $ []
> where spaces = (' ':) . (' ':) . (' ':)
I compiled this with "ghc -O2 -funbox-strict-fields" and got
the following performance results in a simple test.
The wc(1) tool:
$ time /usr/bin/wc </usr/share/dict/words
234937 234937 2486824
real 0m0.069s
user 0m0.059s
sys 0m0.008s
My version:
$ time ./wc </usr/share/dict/words
234937 234937 2486824
real 0m0.406s
user 0m0.322s
sys 0m0.060s
The version from the shootout pages:
$ time ./wc-shootout </usr/share/dict/words
234937 234937 2486824
real 0m2.749s
user 0m2.682s
sys 0m0.062s
Then I made another experiment. I figured, the code above
yells out to be written with a State monad. So I did that:
> import IO
> import Control.Monad.State
>
> type Count = Int
> data CountingState = ST !Bool !Count !Count !Count
> deriving (Show)
>
> type WordCounter = State CountingState ()
>
> initCST = ST True 0 0 0
>
> wc :: Char -> WordCounter
> wc x = get >>= \(ST b l w c) ->
> case (b,x) of
> ( _ , '\n') -> put (ST True (l+1) w (c+1))
> ( _ , '\t') -> put (ST True l w (c+1))
> ( _ , ' ' ) -> put (ST True l w (c+1))
> (True, _ ) -> put (ST False l (w+1) (c+1))
> (False, _ ) -> put (ST False l w (c+1))
>
> main :: IO ()
> main = do
> xs <- getContents
> let ST _ l w c = snd (runState (mapM wc xs) initCST)
> putStrLn $ (l `shows`) . spaces . (w `shows`) . spaces . (c `shows`) $ []
> where
> spaces = (' ':) . (' ':) . (' ':)
Curiously enough, this version fails to process the "words"
file because it runs out of stack space! Naturally, it is
very slow, too. So I wonder: How needs that program above to
be changed in order to solve this space leak?
Why does this happen in the first place?
Peter
More information about the Haskell-Cafe
mailing list