[Haskell-cafe] Re: Haskell Speed

Peter Simons simons at cryp.to
Sun Dec 25 06:24:38 EST 2005


Paul Moore writes:

 > It would be interesting to see standalone code for wcIOB
 > (where you're allowed to assume that any helpers you
 > need, like your block IO library, are available from the
 > standard library). This would help in comparing the
 > "obviousness" of the two approaches.

A simple version of the program -- which doesn't need any
3rd party modules to compile -- is attached below. My guess
is that this approach to I/O is quite obvious, too, if you
have experience with system programming in C.

IMHO, the main point of the example in the article is that

  wc :: String -> (Int, Int, Int)
  wc file = ( length (lines file)
            , length (words file)
            , length file
            )

is a crapy word-counting algorithm. I'm not sure whether
conclusions about functional programming in general or even
programming in Haskell can be derived from this code. Most
people seem to have trouble with lazy-evaluation, first of
all.

Peter



-- Compile with: ghc -O2 -funbox-strict-fields -o wc wc.hs

module Main ( main ) where

import System.IO
import Foreign

type Count = Int
data CountingState = ST !Bool !Count !Count !Count
                     deriving (Show)

initCST :: CountingState
initCST = ST True 0 0 0

wc :: Char -> CountingState -> CountingState
wc '\n' (ST _     l w c) = ST True (l+1)  w   (c+1)
wc ' '  (ST _     l w c) = ST True   l    w   (c+1)
wc '\t' (ST _     l w c) = ST True   l    w   (c+1)
wc  _   (ST True  l w c) = ST False  l  (w+1) (c+1)
wc  _   (ST False l w c) = ST False  l    w   (c+1)


bufsize :: Int                  -- our I/O buffer size
bufsize = 4096

type IOHandler st = Ptr Word8 -> Int -> st -> IO st

countBuf :: IOHandler CountingState
countBuf  _  0 st@(ST _ _ _ _) = return st
countBuf ptr n st@(ST _ _ _ _) = do
  c <- peek ptr
  let st' = wc (toEnum (fromEnum c)) st
  countBuf (ptr `plusPtr` 1) (n - 1) st'

loop :: Handle -> Int -> IOHandler st -> st -> IO st
loop h n f st' = allocaArray n (\ptr' -> loop' ptr' st')
  where
  loop' ptr st = st `seq` do
    rc <- hGetBuf h ptr n
    if rc == 0
       then return st
       else f ptr rc st >>= loop' ptr

main :: IO ()
main = do
  ST _ l w c <- loop stdin bufsize countBuf initCST
  putStrLn . shows l . (' ':) . shows w . (' ':) $ show c



More information about the Haskell-Cafe mailing list