Parsec allocating a lot of memory

Derek Elkins ddarius@hotpop.com
Mon, 4 Aug 2003 15:27:41 -0400


On Mon, 4 Aug 2003 19:16:44 +0200
Nick Name <nick.name@inwind.it> wrote:

> Hi all, I am using "parsec" to parse the output from "xmame -listinfo"
> which is a list of records of the form
> 
> game (
>    attr1 value1
>    ...
>    attrN valueN
> )
> 
> and for approx. 3500 records I got ~250 MB of RSS memory during
> parsing, which takes 20 seconds on my Athlon 1400.
> 
> I think that I must have done something wrong (this is the first time
> I use parsec), here is my parser:
> 
> --- Begin
> games = many game
> 
> game = 
>     do
>     openGame
>     x <- manyTill attribute closeGame
>     return (mkGameInfo x)
> 
> attribute = 
>     do
>     whitespaces
>     x <- identifier
>     whitespaces
>     y <- tillEOL
>     return (x,y)
> 
> openGame = 
>     do
>     string "game ("
>     newline
> 
> closeGame = 
>     do
>     string ")"
>     newline
>     newline
> 
> whitespace = "\v\f\t\r "
> whitespaces = skipMany (oneOf whitespace)
> 
> tillEOL = manyTill anyChar newline
> 
> identifier = many alphaNum
> --- End
> 
> Thanks for any advice
> 
> Vincenzo

I'm pretty sure the issue is the manyTill calls.  Using the Language and
Token modules (which supply the symbol, parens, and identifier used
below) you can write this parser much more compactly and likely more
efficiently.  The Token combinators skip trailing whitespace
automatically, which is very probably what you want.

games = many game

game = do 
    symbol "game"
    attrs <- parens (many attribute)
    return (mkGameInfo attrs)

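attribute = do
    attr <- identifier                -- Token identifier; skips trailing whitespace
    val <- many (satisfy (/= '\n'))   -- the rest of the line is the value
    spaces                            -- skip the newline and any indentation
    return (attr,val)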
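
For anyone who wants to load the above as-is, here is a minimal
self-contained sketch.  It assumes a current parsec, where the modules
are named Text.Parsec.Token and Text.Parsec.Language, and it builds the
token parser from emptyDef.  GameInfo, mkGameInfo, parseGames and sample
are hypothetical stand-ins, since the original mkGameInfo and the real
xmame output are not shown in this thread.

```haskell
import Text.Parsec (ParseError, eof, many, parse, satisfy, spaces)
import Text.Parsec.Language (emptyDef)
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Token as Tok

-- Token parser built from the empty language definition (an assumption;
-- a custom LanguageDef would work as well).
lexer :: Tok.TokenParser ()
lexer = Tok.makeTokenParser emptyDef

symbol :: String -> Parser String
symbol = Tok.symbol lexer

parens :: Parser a -> Parser a
parens = Tok.parens lexer

identifier :: Parser String
identifier = Tok.identifier lexer

-- Hypothetical placeholder for the poster's mkGameInfo.
type GameInfo = [(String, String)]

mkGameInfo :: [(String, String)] -> GameInfo
mkGameInfo = id

games :: Parser [GameInfo]
games = many game

game :: Parser GameInfo
game = do
    _ <- symbol "game"
    attrs <- parens (many attribute)
    return (mkGameInfo attrs)

attribute :: Parser (String, String)
attribute = do
    attr <- identifier                -- also skips whitespace after the name
    val <- many (satisfy (/= '\n'))   -- the rest of the line is the value
    spaces                            -- skip the newline and any indentation
    return (attr, val)

-- Hypothetical entry point: skip leading whitespace, require end of input.
parseGames :: String -> Either ParseError [GameInfo]
parseGames = parse (Tok.whiteSpace lexer *> games <* eof) "<input>"

-- Made-up sample input in the shape described in the question.
sample :: String
sample = unlines
    [ "game ("
    , "    name puckman"
    , "    year 1980"
    , ")"
    , ""
    , "game ("
    , "    name galaga"
    , "    year 1981"
    , ")"
    ]
```

In GHCi, parseGames sample should give a Right holding one attribute
list per record.  One caveat: because identifier skips all trailing
whitespace, newlines included, an attribute with an empty value would
swallow the following line as its value.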