[Haskell-cafe] Lazy HTML parsing with HXT, HaXML/polyparse, what else?
Henning Thielemann
lemming at henning-thielemann.de
Mon May 21 09:09:25 EDT 2007
On Mon, 14 May 2007, Malcolm Wallace wrote:
> Henning Thielemann <lemming at henning-thielemann.de> wrote:
>
> > > > *Text.ParserCombinators.PolyLazy>
> > > > runParser (exactly 4 (satisfy Char.isAlpha))
> > > > ("abc104"++undefined)
> > > > ("*** Exception: Parse.satisfy: failed
> >
> > How can I rewrite the above example that it returns
> > ("abc*** Exception: Parse.satisfy: failed
>
> The problem in your example is that the 'exactly' combinator forces
> parsing of 'n' items before returning any of them. If you roll your
> own, then you can return partial results:
>
> > let one = return (:) `apply` satisfy (Char.isAlpha)
> in runParser (one `apply` (one `apply`
> (one `apply` (one `apply` return []))))
> ("abc104"++undefined)
> ("abc*** Exception: Parse.satisfy: failed
>
> Equivalently:
>
> > let one f = ((return (:)) `apply` satisfy (Char.isAlpha)) `apply` f
> in runParser (one (one (one (one (return []))))) ("abc104"++undefined)
> ("abc*** Exception: Parse.satisfy: failed
I wonder whether 'apply' merges two separate ideas: applying a function
produced by one parser to a value produced by another, and forcing a parser
to always succeed. From the documentation of 'apply' I assumed that
'apply f x' fails if 'f' or 'x' fails. In contrast, it seems to succeed
whenever 'f' succeeds. Wouldn't it be better to have an explicit 'force'
that declares a parser to never fail, and that returns 'undefined' if this
assumption is wrong?
I have seen this 'force' in the MIDI loader of Haskore:
http://darcs.haskell.org/haskore/src/Haskore/General/Parser.hs
With such a 'force', the following would hold:
apply f x == do g <- f; fmap g (force x)
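
To make the proposal concrete, here is a minimal sketch using a toy
Maybe-based parser. This is not polyparse's actual implementation, and the
names Parser, runParser, force, apply, lazyFour and strictFour are defined
here only for illustration. In this sketch the error is raised by 'force'
rather than by 'satisfy'.

```haskell
import Control.Monad (ap, replicateM)
import Data.Char (isAlpha)

-- Toy parser (an assumption for illustration, not polyparse's type):
-- Nothing means failure, Just carries the result and the remaining input.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Nothing        -> Nothing
    Just (a, rest) -> Just (f a, rest)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  (<*>)  = ap

instance Monad Parser where
  Parser p >>= k = Parser $ \s -> case p s of
    Nothing        -> Nothing
    Just (a, rest) -> runParser (k a) rest

satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c:cs) | ok c -> Just (c, cs)
  _             -> Nothing

-- 'force p' succeeds immediately without running p. The result and the
-- remaining input are thunks that raise an error if p turns out to fail.
force :: Parser a -> Parser a
force p = Parser $ \s ->
  let r = case runParser p s of
            Just ar -> ar
            Nothing -> error "force: parser failed"
  in Just (fst r, snd r)

-- The proposed law, used here as the definition of 'apply'.
apply :: Parser (a -> b) -> Parser a -> Parser b
apply f x = do g <- f; fmap g (force x)

-- Four letters, built with the lazy 'apply': partial results are available.
lazyFour :: Parser String
lazyFour = one (one (one (one (pure []))))
  where one rest = (pure (:) `apply` satisfy isAlpha) `apply` rest

-- Four letters, built strictly: all four must parse before anything returns.
strictFour :: Parser String
strictFour = replicateM 4 (satisfy isAlpha)
```

In GHCi, fmap (take 3 . fst) (runParser lazyFour ("abc104" ++ undefined))
yields Just "abc", while runParser strictFour "abc104" fails as a whole.
Demanding the fourth character of the lazy result triggers the error
inside 'force'.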