[Haskell-beginners] parsec and source material with random order lines

Emmanuel Touzery etouzery at gmail.com
Wed Dec 26 15:13:49 CET 2012


Hello,

I've now looked at it in more detail, and a permutation parser is
basically what I want.

There is one catch, though. I'm pushing my luck here, since at this point
it is really my own problem.

Say the parsing unit is the line (which is my situation), there are 6
lines containing data, and I don't know their order. The permutation
parser requires that I know how to parse all 6 lines and pass them to my
data constructor, while in reality I only care about 3 of them. I can't
give it a parser that simply discards lines, a way of saying "this I
don't parse".
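
For illustration, here is a minimal sketch of that limitation, using
permute, (<$$>) and (<||>) from Text.Parsec.Perm. The Event record, the
eventDesc field and the field and eventBody helpers are hypothetical names
made up for this sketch, and the values are kept as raw strings rather
than parsed dates.

```haskell
import Text.Parsec
import Text.Parsec.Perm (permute, (<$$>), (<||>))
import Text.Parsec.String (Parser)

-- Hypothetical record; only the three fields of interest.
data Event = Event
  { eventStart :: String
  , eventEnd   :: String
  , eventDesc  :: String
  } deriving (Show, Eq)

-- Hypothetical helper: parse one "KEY:value" line and return the value.
-- 'try' is needed because DTSTART and DTEND share the prefix "DT".
field :: String -> Parser String
field key = try (string (key ++ ":")) *> manyTill anyChar endOfLine

-- The three lines may come in any order, but each must appear exactly
-- once, and no other line may sit between them.
eventBody :: Parser Event
eventBody = permute $ Event
  <$$> field "DTSTART"
  <||> field "DTEND"
  <||> field "DESCRIPTION"

parseEvent :: Parser Event
parseEvent =
  string "BEGIN:VEVENT" *> endOfLine *> eventBody <* string "END:VEVENT"
```

With this shape, an event whose three lines are merely reordered parses
fine, but an event that also contains an unlisted line (SUMMARY:, say)
is rejected, because none of the three field parsers accepts that line.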

And it makes sense. It's just that in my case I don't want to load every
field an iCalendar file can contain; only a few of them matter to me.

I think I'll pre-process the data to keep only the lines I care about
(dropping every directive I don't understand, which I can do by simply
checking the first few characters of each line) and then hand the result
to parsec. That should be a winning combination.
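
A minimal sketch of that pre-processing step, assuming LF line endings and
no folded continuation lines (iCalendar allows long lines to be continued
on a following line that starts with whitespace; such lines would be
dropped here). The names wantedPrefixes and keepKnownLines are
hypothetical, and the prefix list assumes the plain KEY:value form shown
in the sample record.

```haskell
import Data.List (isPrefixOf)

-- Prefixes of the lines to keep; every other line is dropped
-- before the text reaches the parser.
wantedPrefixes :: [String]
wantedPrefixes =
  ["BEGIN:VEVENT", "END:VEVENT", "DTSTART:", "DTEND:", "DESCRIPTION:"]

-- Keep only the lines that start with one of the wanted prefixes.
keepKnownLines :: String -> String
keepKnownLines = unlines . filter wanted . lines
  where
    wanted l = any (`isPrefixOf` l) wantedPrefixes
```

The filtered text then contains only lines the parser knows about, so a
permutation parser over the remaining fields no longer needs a way to
skip unknown directives.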

Let me know if there is a more elegant way, but the situation is getting
a bit messy (I don't know the order, and I don't want to use all the
input data), so I'm not sure there is.

Thank you very much!

Emmanuel


On Tue, Dec 25, 2012 at 3:28 AM, Brent Yorgey <byorgey at seas.upenn.edu> wrote:

> Hi Emmanuel,
>
> Sounds like you want a permutation parser, perhaps?  Check out
>
>
> http://hackage.haskell.org/packages/archive/parsec/latest/doc/html/Text-Parsec-Perm.html
>
> -Brent
>
> On Tue, Dec 25, 2012 at 12:18:37AM +0100, Emmanuel Touzery wrote:
> > Hi,
> >
> >  I'm trying to parse ical files but the source material doesn't matter
> > much. First, I know there is an icalendar library on hackage, but I'm
> > trying to learn as well through this.
> >
> >  Now the format is really quite simple and actually I'm parsing it, it
> > works, but I don't like the code I'm writing, it feels wrong and I'm sure
> > there is a better way. Actually for now I'm parsing it to an array of
> > arrays, but I want to fill a proper "data" structure.
> >
> >  For my purpose the file contains a bunch of records like this:
> >
> > BEGIN:VEVENT
> > DTSTART:20121218T103000Z
> > DTEND:20121218T120000Z
> > [..]
> > DESCRIPTION:
> > [..]
> > END:VEVENT
> >
> > There are a bunch of records I don't care about, and I also want to parse
> > the directives no matter what order they come in (so I also want to parse
> > when DTEND appears before DTSTART, for instance).
> >
> > That last part is my one problem. I can't do:
> >
> > parseBegin
> > start <- startDate
> > end <- endDate
> > skipRows
> > desc <- description
> > skipRows
> > parseEnd
> > return Event { eventStart = start, eventEnd = end ...}
> >
> > my current working code is:
> >
> > parseEvent = do
> >     parseBegin
> >     contents <- many1 $ (try startDate)
> >             <|> (try endDate)
> >             <|> (try description)
> >             <|> unknownCalendarInfo
> >     parseEnd
> >     return contents
> >
> > But then contents of course returns an array, while I want to return only
> > one element here.
> >
> > SOMEHOW what I would like is:
> >
> > parseEvent = do
> >     parseBegin
> >     contents <- many1 $ (start <- T.try startDate)
> >             <|> (end <- T.try endDate)
> >             <|> (desc <- T.try description)
> >             <|> unknownCalendarInfo
> >     parseEnd
> >     return Event { eventStart = start, eventEnd = end ...}
> >
> >  But obviously as far as Parsec is concerned startDate could occur several
> > times and also it's just not valid Haskell syntax.
> >
> >  So, any hint about this problem? Parsing multi-line records with Parsec,
> > when I don't know the order in which the lines will appear? I mean sure I
> > can convert my array to the proper data structure... I find which element
> > in the array contains the start date and then which contains the end
> > date... and build my data structure.. But I'm sure something much nicer can
> > be done... I just can't find how.
> >
> >  I see the author of iCalendar fixed the problem but I can't completely
> > understand his source, it's too many things at the same time for me, I need
> > to take this one step at a time.
> >
> >  Thank you!
> >
> > Emmanuel
>
> > _______________________________________________
> > Beginners mailing list
> > Beginners at haskell.org
> > http://www.haskell.org/mailman/listinfo/beginners