[Haskell-beginners] parsec and source material with random order lines

Dudley Brooks dbrooks at runforyourlife.org
Tue Dec 25 10:20:59 CET 2012


I think you are right; this looks like the right track. A little more
googling for permutation parsers turned up this question, which is also
about parsing iCal with Parsec:
http://stackoverflow.com/questions/3706172/haskell-parsec-and-unordered-properties

I'll review all this and see if that solves the problem... Thank you!
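
For reference, a minimal sketch of how a permutation parser from
Text.Parsec.Perm might handle the VEVENT records quoted below. This is an
illustration under assumptions, not code from either link: the Event record,
the helper names restOfLine, unknownLine and field, and the fixed set of
three required properties are all made up here, and real iCal details such
as folded lines are ignored. Each field parser also swallows the
unrecognised lines that follow it, so unknown properties can sit anywhere
between the known ones.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import Text.Parsec.Perm (permute, (<$$>), (<||>))

-- Hypothetical target record; the field names follow the original post.
data Event = Event
  { eventStart :: String
  , eventEnd   :: String
  , eventDesc  :: String
  } deriving (Show, Eq)

-- The rest of the current line, consuming the line terminator.
restOfLine :: Parser String
restOfLine = manyTill anyChar (try endOfLine)

-- Prefixes that must never be skipped as "unknown" lines.
knownPrefixes :: [String]
knownPrefixes = ["DTSTART:", "DTEND:", "DESCRIPTION:", "END:VEVENT"]

-- Any complete line that does not start with one of the known prefixes.
unknownLine :: Parser ()
unknownLine = do
  notFollowedBy (choice (map (try . string) knownPrefixes))
  _ <- restOfLine
  return ()

-- A "NAME:value" line, followed by any number of lines we ignore.
-- The 'try' lets the permutation move on to another branch when the
-- prefix does not match (DTSTART and DTEND share the leading "DT").
field :: String -> Parser String
field name = try (string (name ++ ":")) *> restOfLine <* skipMany unknownLine

-- One VEVENT block with the three properties in any order.
parseEvent :: Parser Event
parseEvent = do
  _ <- string "BEGIN:VEVENT" *> endOfLine
  skipMany unknownLine
  ev <- permute (Event <$$> field "DTSTART"
                       <||> field "DTEND"
                       <||> field "DESCRIPTION")
  _ <- string "END:VEVENT"
  return ev
```

With this shape, parse parseEvent "" input should accept DTSTART, DTEND and
DESCRIPTION in any order. A repeated DTSTART is rejected, because each
branch of the permutation can match only once.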

Emmanuel


On Tue, Dec 25, 2012 at 3:28 AM, Brent Yorgey <byorgey at seas.upenn.edu> wrote:

    Hi Emmanuel,

    Sounds like you want a permutation parser, perhaps?  Check out

    http://hackage.haskell.org/packages/archive/parsec/latest/doc/html/Text-Parsec-Perm.html

    -Brent

    On Tue, Dec 25, 2012 at 12:18:37AM +0100, Emmanuel Touzery wrote:
     > Hi,
     >
     >  I'm trying to parse iCal files, but the source material doesn't matter
     > much. First, I know there is an iCalendar library on Hackage, but I'm
     > trying to learn as well through this.
     >
     >  Now the format is really quite simple, and my parser actually works,
     > but I don't like the code I'm writing: it feels wrong and I'm sure
     > there is a better way. For now I'm parsing it to a list of lists, but
     > I want to fill a proper "data" structure.
     >
     >  For my purpose the file contains a bunch of records like this:
     >
     > BEGIN:VEVENT
     > DTSTART:20121218T103000Z
     > DTEND:20121218T120000Z
     > [..]
     > DESCRIPTION:
     > [..]
     > END:VEVENT
     >
     > There are a bunch of lines I don't care about, and I also want to parse
     > the record no matter what order the directives come in (so I want to
     > parse it also if DTEND appears before DTSTART, for instance).
     >
     > That last part is my one problem. I can't do:
     >
     > parseBegin
     > start <- startDate
     > end <- endDate
     > skipRows
     > desc <- description
     > skipRows
     > parseEnd
     > return Event { eventStart = start, eventEnd = end ...}
     >
     > my current working code is:
     >
     > parseEvent = do
     >     parseBegin
     >     contents <- many1 $ (try startDate)
     >                     <|> (try endDate)
     >                     <|> (try description)
     >                     <|> unknownCalendarInfo
     >     parseEnd
     >     return contents
     >
     > But then contents is of course a list, while I want to return only a
     > single Event here.
     >
     > SOMEHOW what I would like is:
     >
     > parseEvent = do
     >     parseBegin
     >     contents <- many1 $ (start <- T.try startDate)
     >                     <|> (end <- T.try endDate)
     >                     <|> (desc <- T.try description)
     >                     <|> unknownCalendarInfo
     >     parseEnd
     >     return Event { eventStart = start, eventEnd = end ...}
     >
     >  But obviously, as far as Parsec is concerned, startDate could occur
     > several times, and it's also just not valid Haskell syntax.
     >
     >  So, any hint about this problem? Parsing multi-line records with Parsec
     > when I don't know the order in which the lines will appear? Sure, I can
     > convert my list to the proper data structure: find which element in the
     > list contains the start date, then which contains the end date, and
     > build my data structure. But I'm sure something much nicer can be
     > done; I just can't find how.
     >
     >  I see the author of iCalendar fixed the problem, but I can't completely
     > understand his source: it's too many things at the same time for me, and
     > I need to take this one step at a time.
     >
     >  Thank you!
     >
     > Emmanuel

     > _______________________________________________
     > Beginners mailing list
     > Beginners at haskell.org
     > http://www.haskell.org/mailman/listinfo/beginners

