[Haskell-cafe] Newbie: a parser for a list of objects?
Dmitri O.Kondratiev
dokondr at gmail.com
Thu Mar 29 09:36:28 EDT 2007
Daniel,
New combinator (<:>) that you introduced helps a lot to understand the whole
thing. I think that your explanation should be included in the next edition
of the "Haskell. The Craft of Functional Programming", I really mean it.
To summarize how I now understand the parser:
Using your combinators, for example:
pList dig "123"
unfolds into:
succeed []
<+> (dig <:> succeed [])
<+> (dig <:> (dig <:> succeed []))
<+> (dig <:> (dig <:> (dig <:> succeed [])))
<+> (dig <:> (dig <:> (dig <:> (dig <:> pList dig))))
where:
succeed [] ~~> [("", "123")]
(dig <:> succeed []) ~~> [("1", "23")]
(dig <:> (dig <:> succeed [])) ~~> [("12", "3")]
(dig <:> (dig <:> (dig <:> succeed []))) ~~> [("123", "")]
(dig <:> (dig <:> (dig <:> (dig <:> pList dig)))) ~~> []
the last one returns [] because:
(dig >*> dig >*> dig >*> dig) "123" ~~> []
As a result we get:
[("", "123")] ++ [("1", "23")] ++ [("12", "3")] ++ [("123", "")] ++ []
~~> [("", "123"), ("1", "23"), ("12", "3"), ("123", "")]
Thanks again Daniel for your great help!
Dima
On 3/28/07, Daniel Fischer <daniel.is.fischer at web.de> wrote:
>
> Am Mittwoch, 28. März 2007 11:57 schrieb Dmitri O.Kondratiev:
> > Daniel,
> > I am still trying to figure out the order of function applications in
> the
> > parser returning list of objects (I attached again the code to the end
> of
> > this message for convenience).
> >
> > You wrote:
> > (>*>) associates to the right, hence
> > p >*> (p >*> (p >*> (... >*> (p >*> succeed [])...)))
> >
> > I don't understand where (p >*> succeed []) comes from?
>
> The final 'succeed []' comes from a) the definition of pList p as
> pList p = succeed [] `alt` ((p >*> pList p) `build` (uncurry (:)))
> plus b) the assumption that p should be a parser which doesn't succed on
> an
> empty input and that the input is finite (though the second point is not
> necessary).
>
> Let us unfold a little:
>
> pList dig "12ab"
> === succeed [] "12ab" ++ (((dig >*> pList dig) `build` (uncurry
> (:))) "12ab")
> === [([],"12ab")] ++ [('1' : ds,rem) | (ds,rem) <- pList dig
> "2ab"]
> -- since dig "12ab" = [('1',"2ab")]
> === [([],"12ab")] ++ [('1' : ds,rem) | (ds,rem) <- (succed []
> `alt`
> (((dig >*> pList dig)
> `build` (uncurry (:))))) "2ab"]
> === [([],"12ab")] ++ [('1' : ds,rem) | (ds,rem) <- ([([],"2ab")]
> ++
> [('2' : ds2,rem2) | (ds2,rem2) <- pList
> dig "ab"])]
> === [([],"12ab"),("1","2ab")] ++
> [('1' : '2' : ds2,rem2) | (ds2,rem2) <- (succeed [] `alt`
> (((dig >*> pList dig) `build` (uncurry (:))))) "ab"]
> === [([],"12ab"),("1","2ab")] ++
> [('1' : '2' : ds2,rem2) | (ds2,rem2) <- ([([],"ab")] ++
> (((dig >*> pList dig) `build` (uncurry (:))) "ab"))]
> -- now 'dig' and hence 'dig >*> pList dig' fail on the input
> "ab", thus
> === [([],"12ab"),("1","2ab"),("12","ab")]
>
> Hum, I find that a bit hard to read myself, so let's introduce an alias
> for
> 'alt', call it (<+>) and a new combinator which joins (>*>) and
> 'build (uncurry (:))' :
> (<:>) :: Parser a b -> Parser a [b] -> Parser a [b]
> p1 <:> p2 = \inp -> [(x:ys,rem2) | (x,rem1) <- p1 inp, (ys,rem2) <- p2
> rem1]
> -- or p1 <:> p2 = build (p1 >*> p2) (uncurry (:))
>
> Then we have (because p1 <:> (p2 <+> p3) === (p1 <:> p2) <+> (p1 <:> p3))
> pList p
> === succeed [] <+> (p <:> pList p)
> === succeed [] <+> (p <:> (succeed [] <+> (p <:> pList p)))
> === succeed [] <+> (p <:> succeed []) <+> (p <:> (p <:> pList p))
> === succeed [] <+> (p <:> succeed []) <+> (p <:> (p <:> (succeed
> [] <+>
> (p <:> pList p))))
> === succeed []
> <+> (p <:> succeed [])
> <+> (p <:> (p <:> succeed []))
> <+> (p <:> (p <:> (p <:> succeed [])))
> <+> (p <:> (p <:> (p <:> (p <:> pList p))))
> and so on.
> And when we request more p's than the input provides, pList p isn't
> reached
> anymore and recursion stops (e.g. with p = dig and input "123" or
> "123a45",
> the last line will fail because it demands four digits from the beginning
> of
> the input, but there are only three).
> If p would succeed on an empty input, e.g. p = succeed 1 or the input is
> an
> infinite list of successes for p, e.g. p = dig and input = cycle "123",
> the
> unfolding would never stop, producing an infinite list of results, but
> each
> of these results wolud come from a finite chain of p's ended by a
> 'succeed []'.
>
> So the order of evaluation of
> pList p input = (succeed [] <+> (p <:> pList p)) input
> = succeed [] input ++ (p <:> pList p) input
> is
> 1. evaluate the first argument of (++), succeed [] input == [([],input)]
> Since this is not _|_, we need also the second argument of (++), so
> 2. evaluate (p <:> pList p) input (to whnf first, more if required)
> 3. evaluate (++) as far as needed
>
> 2. is further split,
> 2.1. evaluate p input, giving a list of (obj,rem) pairs -- if that's
> empty,
> we're done, also if that produces _|_
> 2.2. (partially) evaluate pList p rem (goto 1.) giving a list of
> (objlist,rem2); [([],rem),([obj2],rem'),([obj2,obj3],rem'')...]
> 2.3. return the list of (obj:objlist,rem2) pairs
>
> >
> > Yet, if the order is as you describe, everything goes well, for example:
> >
> > comp1 = dig >*> dig has type - Parser char (char, char)
> > comp2 = dig >*> (succeed []) has type - Parser char (char, [a])
> > pl1 = comp2 `build` (uncurry (:)) has type - Parser char (char, [char])
>
> pl1 has type Parser Char [Char] because 'uncurry (:)' has type (a,[a]) ->
> [a]
>
> >
> > At first run
> > (succeed []) `alt` ((p >*> pList p) `build` (uncurry (:)))
> >
> > should be:
> > [] ++ ((p >*> pList p) `build` (uncurry (:)))
>
> (succeed [] `alt` ((p >*> pList p) `build` (uncurry (:)))) input
> gives
> [([],input)] ++ ((p >*> pList p) `build` (uncurry (:))) input
> >
> > so how we get:
> > (p >*> succeed []) ?
> >
> > Thanks,
> > Dima
> >
> Anytime,
> Daniel
>
>
--
Dmitri O Kondratiev
dokondr at gmail.com
http://www.geocities.com/dkondr
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20070329/bd6479ae/attachment-0001.htm
More information about the Haskell-Cafe
mailing list