[Haskell-cafe] Speed of character reading in Haskell

Fri Sep 7 02:28:42 EDT 2007

> I started with the obvious
> 	main = getContents >>= print . tokenise
> where tokenise maps its way down a list of characters.  This is very
> simple, very pleasant, and worked like a charm.
> However, the language has an INCLUDE directive, so I'm going to have
> to call readFile or something in the middle of tokenising, so the
> main tokeniser loop can't be a pure String -> [Token] function any
> more.

What about

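> import Control.Monad.Fix (mfix)
> tokenise :: [String] -> ([Token],[FilePath])
> main = print . fst =<< mfix process where
>     -- Lazy pattern: mfix must not force its own result before process returns.
>     process ~(_tokens,paths) = do
>         mainContents <- getContents
>         -- Caveat: mapM forces the spine of paths, which is not available until
>         -- process returns; the reads would need deferring (e.g. unsafeInterleaveIO).
>         includes <- mapM readFile paths
>         return $ tokenise $ mainContents : includes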

I guess it would be useful to replace ([Token],[FilePath]) with Writer [FilePath] [Token], so that the include paths are collected as a side output of tokenising.
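
A minimal sketch of what the Writer variant might look like, assuming the mtl package is available. The Token type, the "INCLUDE <path>" line syntax and the per-line tokenising are all hypothetical stand-ins, since none of them are defined in this thread. The sketch only collects the include paths; it does not read the files.

```haskell
import Control.Monad.Writer (Writer, runWriter, tell)

-- Hypothetical token type; the thread does not define one.
data Token = Word String
  deriving (Show, Eq)

-- Tokenise one source text, logging every requested include path.
-- Assumed syntax: a line of the form "INCLUDE <path>".
tokenise :: String -> Writer [FilePath] [Token]
tokenise src = concat <$> mapM line (lines src)
  where
    line l = case words l of
      ["INCLUDE", path] -> tell [path] >> return []  -- record the path, emit no tokens
      ws                -> return (map Word ws)      -- every other word becomes a token

main :: IO ()
main = do
  let (tokens, paths) = runWriter (tokenise "a b\nINCLUDE defs.inc\nc")
  print tokens  -- [Word "a",Word "b",Word "c"]
  print paths   -- ["defs.inc"]
```

The caller gets the tokens and the list of paths back from runWriter, and can then decide how and when to read the included files.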

> Method 1A (pure list processing)
>      main = getContents >>= print . doit 0
>      doit n ('\n':cs) = doit (n+1) cs
>      doit n ( _  :cs) = doit  n    cs
>      doit n []        = n

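I think you should use something like (doit $! n+1) cs here, so that the counter is evaluated at each step instead of building up a chain of unevaluated (n+1) thunks.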
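
A minimal sketch of that change, using only the Prelude. The name countNewlines and the Int type are my assumptions; the quoted doit has no type signature.

```haskell
-- Count newlines with an accumulator that is forced at every step,
-- so no chain of (n+1) thunks can build up.
countNewlines :: Int -> String -> Int
countNewlines n ('\n':cs) = (countNewlines $! n + 1) cs
countNewlines n (_:cs)    = countNewlines n cs
countNewlines n []        = n

main :: IO ()
main = getContents >>= print . countNewlines 0
```

Since ($!) has the lowest precedence, countNewlines $! n + 1 parses as countNewlines $! (n + 1), and the sum is evaluated before the recursive call is made.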

> In *retrospect*, it is really obvious why this was
> necessary, but I must say that in *prospect* I wasn't expecting it.

In fact, I was expecting this to be an issue even for 1A. I suppose GHC is smart enough to eliminate the laziness in the first method.