[Haskell-cafe] Newbie: a parser for a list of objects?

Tue Mar 27 06:15:49 EDT 2007

Thanks Daniel!
Things are getting more in shape, yet I still can not fully comprehend the
expression:

((p >*> pList p) `build` (uncurry (:)))

where

 (>*>) :: Parse a b -> Parse a c -> Parse a (b, c)
 (>*>) p1 p2 inp = [((x,y), rem2) |(x, rem1) <- p1 inp, (y, rem2) <- p2
rem1]

 build :: Parse a b -> (b -> c) -> Parse a c
 build p f inp = [ (f x, rem) | (x, rem) <- p inp]

So in fact recursive application:

p >*> pList p

should unfold in something like:

((p >*> p) >*> p) >*> p ...

and *all*  iterations of

p >*> pList p

will be done *before* 'build' will be applied?

Correct?

Thanks,
Dima

On 3/26/07, Daniel Fischer <daniel.is.fischer at web.de> wrote:
>
> > -----Ursprüngliche Nachricht-----
> > Von: "Dmitri O.Kondratiev" <dokondr at gmail.com>
> > Gesendet: 26.03.07 16:44:12
> > An: haskell-cafe at haskell.org
> > Betreff: [Haskell-cafe] Newbie: a parser for a list of objects?
>
> > Please see my questions inside comments {-- --} :
> > Thanks!
> >
> > ---
> > module Parser where
> >
> > import Data.Char
> >
> > type Parse a b = [a] -> [(b, [a])]
> >
> > {--
> > Newbie: a parser for a list of objects?
> >
> > I am working with the section  17.5 "Case study: parsing expressions" of
> the book "Haskell The Craft of Functional Programming", where a parser for a
> list of objects is defined.
> > I called this function pList in order to avoid confusion with 'list' as
> a term for data structure.
> >
> > Please help me to understand how pList works (please, see the rest of
> the code at the end of this message):
> > --}
> >
> > pList :: Parse a b -> Parse a [b]
> > pList p = (succeed []) `alt`
> >          ((p >*> pList p) `build` (uncurry (:)))
> >
> >
> > {--
> > First of all, I don't quite understand why there must be a choice
> ('alt') between the function ('succeed') that always returns an empty list
> and the other part? This results in adding [] to the front, why?
> >
>
> Well, if the parser p doesn't succeed, we don't want the whole thing to
> fail. And p will (almost certainly) fail when the end of input is reached.
> So without the alternative 'succeed []', we'd get
>
> pL1 dig "12"  = [(('1':y),rem) | (y,rem) <- pL1 dig "2"]
>                    = [(('1':y),rem) | (y,rem) <- [(('2':z),rem2) |
> (z,rem2) <- pL1 dig ""]]
>                    = [(('1':y),rem) | (y,rem) <- [(('2':z),rem2) |
> (z,rem2) <- []]
>                    = [(('1':y),rem) | (y,rem) <- []]
>                    = []
>
> because dig "" = []
>
> > I thought that 'simplified' version of pList should still work fine.
> Trying to prove this I wrote :
> > --}
> >
> > pL1 :: Parse a b -> Parse a [b]
> > pL1 p = (p >*> pL1 p) `build` (uncurry (:))
> >
> > {--
> > Which, as expected, does not work correctly - just gives an empty list
> [] -  but I don't understand why:
>
> because the parser eventually fails when the end of input is reached.
> >
> > *Parser> t1 "12345"
> > []
> > *Parser>
> >
> > Also, I don't understand why the textbook version of pList gives this
> result:
> >
> > *Parser> test "12345"
> >
> [("","12345"),("1","2345"),("12","345"),("123","45"),("1234","5"),("12345","")]
>
> That's because of the order of alt's arguments:
>
> (succeed [] `alt` p) inp = [([],inp)] ++ (p inp)
>
> with pList p = ((p >*> pList p) `build` (uncurry (:))) `alt` succeed []
> the resulting list woulde be reversed.
>
> >
> > *Parser>
> >
> > In particular, I don't understand where the first element ("","12345")
> of the resulting list comes from?
> >
> > I am trying to figure out how pList recursively unfolds. To my mind
> operators in the expression:
> >
> >
> > (succeed []) `alt`((p >*> pList p) `build` (uncurry (:)))
> >
> > has the following execution order:
> > 1)  >*>
> > 2) 'build'
> > 3) 'alt'
> >
> No, the first argument of alt gets evaluated first, because (p1 `alt` p2)
> inp = (p1 inp) ++ (p2 inp), thus we need p1 inp first.
> Then we see we haven't hit bottom, so we need the second argument of (++)
> (resp. alt).
> So next we need to evaluate p, then pList p, combine the results of those
> with the second argument of build, uncurry (:).
>
> > It seems that operation >*> should be done as many times as many
> elements the input list has. Right?
> >
>
> Unfortunately not. Let's stay with pList dig. Say your input starts with n
> digits.
> From the example above you can conjecture that length (pList dig inp) ==
> (n+1).
> Now in the outermost (dig >*> pList dig) branch, you apply (pList dig) to
> an input beginning with (n-1) digits, returning a list of length n,
> to each element of this list you adjoin the first digit, resulting in n +
> (n-1) + ... + 1 = n*(n+1)/2 applications of (>*>).
> (Lesson: you need an exclusive choice, using the second parser only if the
> first one fails and a maximal munch combinator in your library, too)
>
> >
> > Signature:
> >
> > (>*>) :: Parse a b -> Parse a c -> Parse a (b, c)
> >
> > implies that second argument of the expression:
> >
> > p >*> pList p
> >
> > should be of type 'Parse a c' but in this application it is of type
> 'Parse a b -> Parse a [b]'
> >
> c is [b], so p >*> pList p has type Parse a (b,[b]), then
> (p >*> pList p) `build` (uncurry (:)) has type Parse a [b]
>
> > How can that be?
> > How recursion termination conditinon is expressed in pList?
>
> recursion terminates when p fails.
>
> HTH,
> Daniel
>
> > --}
> >
> > none :: Parse a b
> > none inp = []
> >
> > succeed :: b -> Parse a b
> > succeed val inp = [(val, inp)]
> >
> > suc:: b -> [a] -> [(b, [a])]
> >
> > suc val inp = [(val, inp)]
> >
> > spot :: (a -> Bool) -> Parse a a
> > spot p [] = []
> > spot p (x:xs)
> >      | p x = [(x, xs)]
> >      | otherwise = []
> >
> > alt :: Parse a b -> Parse a b -> Parse a b
> > alt p1 p2 inp = p1 inp ++ p2 inp
> >
> > bracket = spot (=='(')
> > dash = spot (== '-')
> > dig = spot isDigit
> > alpha = spot isAlpha
> >
> > infixr 5 >*>
> >
> > (>*>) :: Parse a b -> Parse a c -> Parse a (b, c)
> >
> > (>*>) p1 p2 inp = [((x,y), rem2) |(x, rem1) <- p1 inp, (y, rem2) <- p2
> rem1]
> >
> > build :: Parse a b -> (b -> c) -> Parse a c
> > build p f inp = [ (f x, rem) | (x, rem) <- p inp]
> >
> > test = pList dig
> > t1 = pL1 dig
> >
> >
> > -----------------------------------------------------------------
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20070327/ce3d0eab/attachment.htm