Space handling with Parsec

Thu Nov 8 07:41:47 EST 2007

Matthew Danish <mdanish at andrew.cmu.edu> writes:

> On Mon, Nov 05, 2007 at 01:18:02PM +0200, Anakreon Mendis wrote:
>> What I tried to do is:
>> imported the Parsec module, hiding the `space`.
>> Reimplemented the space parser as
>> space = oneOf [' ', '\t']
>> 
>> The parser behaved the same.
>> 
>> Afterwards, I implemented the whiteSpace parser
>> as
>> whiteSpace = skipMany space
>> 
>> This didn't work either.
>
> If you're using the lexing functionality found in Parsec.Token then
> keep in mind that those combinators will drop whitespace
> automatically.  Shadowing the `space' function doesn't affect any
> other modules.  If you want whitespace sensitivity then you'll have to
> write or use a separate scanner, as described in the manual.

It's a pity.
I do not want to implement a scanner when Parsec does the job
well (except for newline treatment).

I wonder whether Parsec could be extended in two ways.

1: Accept in LanguageDef a parser for line comments instead of a single
   fixed string (commentLine). In BASIC, comments start with either "'"
   or "REM".
2: Allow the treatment of newlines, and of whitespace in general, to be
   configured in LanguageDef.
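
Until something like that exists, both points can be approximated without
Parsec.Token by writing the lexeme combinator by hand. The following is
only a minimal sketch: the names hspace, lineComment, lexeme, word, line
and program are hypothetical helpers, not part of Parsec, and the comment
rule is deliberately naive (any token starting with REM, such as REMOTE,
is taken as a comment).

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Skip spaces and tabs only; newlines are deliberately left alone.
hspace :: Parser ()
hspace = skipMany (oneOf " \t")

-- A BASIC line comment starts with ' or REM and runs to the end of
-- the line. The newline itself stays in the input.
lineComment :: Parser ()
lineComment = do
  _ <- try (string "REM") <|> string "'"
  skipMany (noneOf "\n")

-- Hand-written stand-in for Parsec.Token's lexeme: skips trailing
-- horizontal whitespace and an optional comment, but never a newline.
lexeme :: Parser a -> Parser a
lexeme p = p <* hspace <* optional lineComment

-- Toy token: a run of letters.
word :: Parser String
word = lexeme (many1 letter)

-- One statement per line; the newline is an explicit terminator.
line :: Parser [String]
line = hspace *> optional lineComment *> many word <* newline

program :: Parser [[String]]
program = many line <* eof
```

With these definitions, parse program "" "PRINT x ' hi\nLET y\n" should
keep the two lines apart instead of merging them into one token stream.
Every line, including the last, must end with a newline in this sketch.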

As a workaround, each newline is changed into $\n before the input is
delivered to the parser, and it works.
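
A sketch of that workaround, under some assumptions of mine: that $\n means
a literal '$' followed by the newline, that '$' does not otherwise occur in
the input, and that the grammar then treats '$' as the statement
terminator. markNewlines, statement and program are hypothetical names;
emptyDef and makeTokenParser are the usual Parsec ones.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import Text.Parsec.Language (emptyDef)
import qualified Text.Parsec.Token as P

-- Put a '$' in front of every newline, so the end of a line survives
-- the automatic whitespace skipping done by the token parsers.
markNewlines :: String -> String
markNewlines = concatMap mark
  where
    mark '\n' = "$\n"
    mark c    = [c]

lexer :: P.TokenParser ()
lexer = P.makeTokenParser emptyDef

-- A toy statement: identifiers terminated by the '$' marker.
-- The real newline after '$' is eaten as ordinary whitespace.
statement :: Parser [String]
statement = many (P.identifier lexer) <* P.symbol lexer "$"

program :: Parser [[String]]
program = P.whiteSpace lexer *> many statement <* eof

-- Preprocess the input, then run the ordinary token-based parser.
parseSource :: String -> Either ParseError [[String]]
parseSource = parse program "" . markNewlines
```

Here parseSource "LET x\nPRINT x\n" parses two separate statements. A final
line without a trailing newline gets no marker and is rejected.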