[Haskell-cafe] Lazy Parsing
Guenther Schmidt
gue.schmidt at web.de
Fri May 29 07:38:59 EDT 2009
Dear Doaitse,
In the days since my original post I had already come to favor the
uu-parsing package. I have printed the report and read it every day to
figure out how to use it. I cannot follow everything yet, and also hope
that won't be necessary in order to use it. :-)
My progress is a bit slow, but I'm not giving up. What I do like most,
over the other combinatory packages, is the approach of using
"breadth-first" when it comes to choice, the idea is certainly
enlightening. The packages capability to do "online- / partial parsing"
is essential for me.
I am a bit surprised about it's "raw" state. The basic combinators and
primitives are there but combinators like pChain, pDigit etc. are not
predefined and merely present in the examples package.
I had gotten quite comfortable with parsec and need to find the right
way to "translate" my parsec code to your package.
Anyway let me thank you for your work, I really appreciate it very much.
Günther
S. Doaitse Swierstra schrieb:
> Lazy parsing has been the default for the last ten years in uulib, and
> is now available in the simple uu-parsinglib
> (http://hackage.haskell.org/cgi-bin/hackage-scripts/package/uu-parsinglib).
> The whole design of the latter in described in a technical report to
> which references are given on the web page. It provides also error
> correction, the ability to use several different kinds of input
> tokens, and (with some help) ambiguities. If speed is an issue you can
> insert extra hints which locally change the breadth-first parsing
> process locally into a somewhat more depth-first form. When compared
> with Parsec the good news is that usually you do not have to put
> annotations to get nice results.
>
> The older uulib version also performs an abstract interpretation which
> basically changes the search for which alternative to take from a
> linear to a logarithmic complexity, but does not provide a monadic
> structure, in which you use results recognised thus far to construct
> new parsers.
>
> Both the old uulib version and the new version have always had an
> applicative interface.
>
> In the near future elements of the abstract interpretation of the old
> uulib version will migrate into the new version. It is the advent of
> GADT's which made this new version feasable.
>
> An example of the error correction at work at the following example code:
>
> pa, pb, paz :: P_m (Str Char) [Char]
> pa = lift <$> pSym 'a'
> pb = lift <$> pSym 'b'
> p <++> q = (++) <$> p <*> q
> pa2 = pa <++> pa
> pa3 = pa <++> pa2
>
> pCount p = (\ a b -> b+1) <$> p <*> pCount p <<|> pReturn 0
> pExact 0 p = pReturn []
> pExact n p = (:) <$> p <*> pExact (n-1) p
>
> paz = pMany (pSym ('a', 'z'))
>
> paz' = pSym (\t -> 'a' <= t && t <= 'z', "a .. z", 'k')
>
> main :: IO ()
> main = do print (test pa "a")
> print (test pa "b")
> print (test pa2 "bbab")
> print (test pa "ba")
> print (test pa "aa")
> print (test (do l <- pCount pa
> pExact l pb) "aaacabbb")
> print (test (amb ( (++) <$> pa2 <*> pa3 <|> (++) <$> pa3 <*> pa2))
> "aaabaa")
> print (test paz "ab1z7")
> print (test paz' "m")
> print (test paz' "")
>
>
> is
>
> loeki:~ doaitse$ ghci -package uu-parsinglib
> GHCi, version 6.10.1: http://www.haskell.org/ghc/ :? for help
> Loading package ghc-prim ... linking ... done.
> Loading package integer ... linking ... done.
> Loading package base ... linking ... done.
> Loading package syb ... linking ... done.
> Loading package array-0.2.0.0 ... linking ... done.
> Loading package filepath-1.1.0.1 ... linking ... done.
> Loading package old-locale-1.0.0.1 ... linking ... done.
> Loading package old-time-1.0.0.1 ... linking ... done.
> Loading package unix-2.3.1.0 ... linking ... done.
> Loading package directory-1.0.0.2 ... linking ... done.
> Loading package process-1.0.1.1 ... linking ... done.
> Loading package random-1.0.0.1 ... linking ... done.
> Loading package haskell98 ... linking ... done.
> Loading package uu-parsinglib-2.0.0 ... linking ... done.
> Prelude> :m Text.ParserCombinators.UU.Examples
> Prelude Text.ParserCombinators.UU.Examples> main
> ("a",[])
> ("a",[
> Deleted 'b' at position 0 expecting one of ["'a'"],
> Inserted 'a' at position 1 expecting one of ["'a'"]])
> ("aa",[
> Deleted 'b' at position 0 expecting one of ["'a'"],
> Deleted 'b' at position 1 expecting one of ["'a'"],
> Deleted 'b' at position 3 expecting one of ["'a'"],
> Inserted 'a' at position 4 expecting one of ["'a'"]])
> ("a",[
> Deleted 'b' at position 0 expecting one of ["'a'"]])
> ("a",[
> The token 'a'was not consumed by the parsing process.])
> (["b","b","b","b"],[
> Deleted 'c' at position 3 expecting one of ["'a'","'b'"],
> Inserted 'b' at position 8 expecting one of ["'b'"]])
> (["aaaaa"],[
> Deleted 'b' at position 3 expecting one of ["'a'","'a'"]])
> ("abz",[
> Deleted '1' at position 2 expecting one of ["'a'..'z'"],
> The token '7'was not consumed by the parsing process.])
> ('m',[])
> ('k',[
> Inserted 'k' at position 0 expecting one of ["a .. z"]])
> Prelude Text.ParserCombinators.UU.Examples>
>
> Doaitse Swierstra
>
>
>
>
>
> On 27 mei 2009, at 01:52, GüŸnther Schmidt wrote:
>
>> Hi all,
>>
>> is it possible to do lazy parsing with Parsec? I understand that one
>> can do that with polyparse, don't know about uulib, but I happen to
>> be already somewhat familiar with Parsec, so before I do switch to
>> polyparse I rather make sure I actually have to.
>>
>> The files it has to parse is anywhere from 500 MB to 5 GB.
>>
>>
>> Günther
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
More information about the Haskell-Cafe
mailing list