[Haskell-cafe] Parsec bug, or...?

Tue Oct 20 09:21:12 EDT 2009

After some private exchange of info between Uwe and me it became clear  
that it may not have been immediately clear that the error messages in  
my original posting where actually part of a more involved process.

By removing the optional part in the pCommand (i.e. making the part  
starting with `opt` invisible, we can show the full power of the built- 
in error correction.

import Text.ParserCombinators.UU.Parsing

pCommand [] = pure []
pCommand (x:xs) = ((:) <$> pSym x <*> pCommand xs) -- `opt` (x:xs)

pCommands = amb . foldr (<|>) pFail . map pCommand $ ["banana",  
"chocolate", "frito", "fromage"]

t :: String -> ([String], [Error Char Char Int])
t input = parse ( (,) <$> pCommands <*> pEnd)  (listToStr input)

Now one also gets error messages like:

*Main> t "frxi"
(["frito"],[
Deleted  'x' at position 2 expecting one of ['i', 'o'],
Inserted 't' at position 4 expecting 't',
Inserted 'o' at position 4 expecting 'o'])

for free.

However, since the decision what element to take is based on a limited  
look-ahead, one also gets:

*Main> t "xfxrxix"
(["fromage"],[
Deleted  'x' at position 0 expecting one of ['b', 'c', 'f', 'f'],
Deleted  'x' at position 2 expecting 'r',
Deleted  'x' at position 4 expecting 'o',
Deleted  'i' at position 5 expecting 'o',
Deleted  'x' at position 6 expecting 'o',
Inserted 'o' at position 7 expecting 'o',
Inserted 'm' at position 7 expecting 'm',
Inserted 'a' at position 7 expecting 'a',
Inserted 'g' at position 7 expecting 'g',
Inserted 'e' at position 7 expecting 'e'])
*Main>

which is something not completely expected; the current look-ahead is  
however three symbols ahead, and once a decision is taken this is not  
reconsidered (for cost reasons). This is currently a consequence of  
the rather simplistic inner organisation of the intermediate library.  
In the next version we hope to have gotten rid of this artefact.

Best,
Doaitse

>
> Yes, for my particular problem the complexity is very limited. I
> wouldn't even have used parsec for this, in spite of the comment I had
> made earlier about this, if I were not already using it in a different
> part of the project to parse individual records ("buy security <foo>
> for this price on this date", etc), so it was natural to add a bit
> more parsec code to also deal with the commands saying what I want to
> see from the data. It's all still pretty trivial, but starting already
> to be useful to me... it's really quite gratifying what a small amount
> of haskell code suffices to make a useful and flexible program.
>
> best regards,
> Uwe
>
> On 10/15/09, S. Doaitse Swierstra <doaitse at swierstra.net> wrote:
>>
>> On 15 okt 2009, at 16:58, Uwe Hollerbach wrote:
>>
>>> Hi, all, thanks for the further inputs, all good stuff to think
>>> about... although it's going to be a little while before I can
>>> appreciate the inner beauty of Doaitse's version! :-)
>>
>> The nice thing is that you do not have to understand the inner
>> workings ;-} I basically builds a greedy parser for each word to be
>> recognised which can stop and assume the rest is there if it can no
>> longer proceed (the `opt` is greedy in its left alternative) . Hence
>> it recognises the longest possible prefix.
>> Since my parsers pursue all alternatives in parallel you  
>> automatically
>> get what you want, without having to indicate prefix lengths, calls  
>> to
>> try, etc.
>>
>> The "amb" combinator has type
>>
>> amb :: Parser a -> Parser [a]
>>
>> and collects the result from all alternatives its argument parser is
>> constructed from; you might say it convert an ambiguous parser to a
>> parser with a list as result, hence preventing the rest of the input
>> being parsed over and over again. I am currently working on bringing
>> back more abstract interpretation in the implementation (i.e. what we
>> have had for almost 10 years in the uulib library), but I do not
>> expect you to see a lot of that from the outside.
>>
>> If you want to work with left-recursive parsers (which does not seem
>> to be the case), you may revert to more complicated solutions such as
>> found in the "christmastree" (Changing Haskell's Read Implementation
>> Such That by Manipulationg Abstract Syntax Trees Read Evaluates
>> Efficiently) package if you need to generate parsers online, or to
>> happy-based solutions in case your grammar is fixed.
>>
>>
>> If you have any questions do not hesitate to ask,
>> Doaitse
>>
>>
>>> I had considered
>>> the approach of doing a post-parsec verification, but decided I  
>>> wanted
>>> to keep it all inside the parser, hence the desire to match prefixes
>>> there (and lack of desire to write 'string "p" <|> string "pr" <|>
>>> string "pre" ...'.
>>>
>>> By way of background, the actual stuff I'm wanting to match is not
>>> food names, but some commands for a small ledger program I'm working
>>> on. I needed something like that and was tired of losing data to
>>> quicken every so often. I realize of course that there are other
>>> excellent ledger-type programs out there, but hey, I also needed
>>> another hacking project. I'll put this onto hackage in a while, once
>>> it does most of the basics of what I need. No doubt the main
>>> differentiator between mine and those other excellent ledger  
>>> programs
>>> out there will be that mine has fewer features and more bugs...
>>>
>>> thanks again, all!
>>>
>>> Uwe
>>
>>