[Haskell-cafe] Parsec memory behavior

Wed May 17 14:36:00 EDT 2006

   Try this simple program:

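import Text.ParserCombinators.Parsec

-- Accept any single character, leaving the source position unchanged.
ppAny :: GenParser Char Bool Char
ppAny = tokenPrim show (\pos _ _ -> pos) Just

-- Accept the whole input, one character at a time.
ppTest :: GenParser Char Bool String
ppTest = many ppAny

-- Run the parser; on failure, return the error message in place of the output.
p :: String -> String
p s =
    case runParser ppTest True "" s of
        Left err     -> show err
        Right result -> result

main :: IO ()
main = do
    let fname = "C:/main.i"
    f <- readFile fname
    let tokens = p f
    print (head tokens)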

   This is about the simplest possible use of Parsec: it just returns the 
input unchanged. But it already seems to have a big memory leak. When 
parsing a 2 MB file, I see the process grow to about 160 MB before it 
exits.

   I assume this is known behavior. I believe it's because of the way 
Parsec deals with errors: when you chain two parsers in sequence, the 
result of the first one (in this case the head of the list) is not 
available until Parsec knows whether the second one failed.
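
   To illustrate the effect I mean, here is a rough sketch that does not 
use Parsec at all, only base. The function name allOrNothing and the rule 
"NUL is an error" are made up for the example; this is an analogy for a 
result wrapped in Either, not a description of Parsec's internals.

```haskell
-- Hypothetical stand-in for a parser whose whole result is wrapped in Either.
-- The Right constructor can only be produced once every character has been
-- checked, so the caller cannot see the first element any earlier than that.
allOrNothing :: String -> Either String String
allOrNothing = traverse check
  where
    check '\0' = Left "unexpected NUL"
    check c    = Right c

main :: IO ()
main = do
  -- Only one element is wanted, yet all 100000 characters are examined first.
  print (fmap (take 1) (allOrNothing (replicate 100000 'a')))  -- Right "a"
  -- A single bad character anywhere replaces the entire result.
  print (allOrNothing "ab\0cd")                                -- Left "unexpected NUL"
```

   With this shape of result, asking for the head still forces a pass over 
the entire input, which is the behavior I suspect in the program above.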

   Is there any known alternative that doesn't exhibit this behavior? It 
would have to somehow return errors inline or on a "side channel". I'll 
be toying with this sort of thing for a while.
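
   As a starting point, here is a minimal sketch of what "errors inline" 
could look like, again using only base. The Item type and the scan function 
are hypothetical names of my own, not part of Parsec or any existing 
library, and the "NUL is an error" rule is only there to have something to 
report.

```haskell
-- Hypothetical item type: each element is either a token or an error, so
-- errors travel inside the result stream instead of wrapping around it.
data Item = Tok Char | Err String
  deriving (Show, Eq)

-- Classify characters lazily; nothing past the demanded prefix is examined.
scan :: String -> [Item]
scan = map classify
  where
    classify '\0' = Err "unexpected NUL"
    classify c    = Tok c

main :: IO ()
main = do
  -- Works even on an infinite input, because only the first item is forced.
  print (take 1 (scan (cycle "abc")))       -- [Tok 'a']
  -- The tokens before an error are available without looking past it.
  print (takeWhile isTok (scan "ab\0cd"))   -- [Tok 'a',Tok 'b']
  where
    isTok (Tok _) = True
    isTok _       = False
```

   The price is that the consumer has to check every item for an error 
instead of pattern matching once on the overall result.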

   Thanks!

JCAB