[Haskell-cafe] Parsec to parse tree structures?

david fries djf at gmx.ch
Sun Mar 14 13:28:45 EDT 2010


Hi Stephen

Perhaps my description of the format was a bit unclear. When I said
pointer, I simply meant a number giving a byte position in the stream.

Imagine the tables looking something like this:

RootTable
   HeaderMagicNumber (1 Byte): 0x50
   VersionNumber (2 Bytes): 1234
   SubTablePointer (4 Bytes): 200 ------------.
...Some more fields of the root table         ¦
                                              ¦
                                              ¦
SubTable (positioned at byte 200 of the file)<'
   SomeFlags (1 Byte): 0x00
   Comment (Variable String): "hello, world!\0"

AnotherTable
...

Our parser produced object instances representing each table, and I used
normal references between tables instead of pointers.

You had to start parsing at the beginning of the file (where the root
table is); otherwise you had no clue where in the structure you were,
because all tables after the root table are positioned dynamically when
the whole thing is serialized.
So, to parse the root, you read the first byte, check that it's 0x50,
read the next two bytes (which are the version number), then you read
four bytes (pointer to the sub table). The sub table starts at byte 200
of the file. So now I would jump to that position in the file and start
parsing SubTable. After that I'd jump back and parse the remaining
fields of the root table.
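
In Haskell, that jump-and-return order could be sketched roughly as
below. This is only a sketch under assumptions of mine, not our actual
parser: it uses Data.Binary.Get from the binary package rather than
Parsec, it assumes big-endian fields, and the names RootTable, SubTable,
parseAt, getRootTable, parseFile and sample are made up for illustration.
The "jump" is done by keeping the whole file around and running the
sub-table parser on the input with the first N bytes dropped, so the
root parser's own position is untouched and it can carry on with its
remaining fields afterwards.

```haskell
import           Control.Monad              (when)
import           Data.Binary.Get            (Get, getLazyByteStringNul,
                                             getWord16be, getWord32be,
                                             getWord8, runGetOrFail)
import qualified Data.ByteString.Lazy       as BL
import qualified Data.ByteString.Lazy.Char8 as BLC
import           Data.Word                  (Word16, Word32, Word8)

-- Parsed tables refer to each other directly; no offsets survive parsing.
data SubTable = SubTable
  { someFlags :: Word8
  , comment   :: BL.ByteString   -- stored without the trailing NUL
  } deriving (Show, Eq)

data RootTable = RootTable
  { versionNumber :: Word16
  , subTable      :: SubTable
  } deriving (Show, Eq)

-- Run a parser at an absolute byte offset of the whole file.
parseAt :: BL.ByteString -> Word32 -> Get a -> Either String a
parseAt whole offset p =
  case runGetOrFail p (BL.drop (fromIntegral offset) whole) of
    Left  (_, _, err) -> Left err
    Right (_, _, x)   -> Right x

-- SomeFlags (1 byte), then a NUL-terminated comment string.
getSubTable :: Get SubTable
getSubTable = SubTable <$> getWord8 <*> getLazyByteStringNul

-- The root parser takes the whole file so it can follow pointers.
getRootTable :: BL.ByteString -> Get RootTable
getRootTable whole = do
  magic <- getWord8
  when (magic /= 0x50) $ fail "bad header magic number"
  version <- getWord16be          -- assumption: big-endian
  ptr     <- getWord32be          -- absolute position of the sub table
  sub     <- either fail return (parseAt whole ptr getSubTable)
  -- The remaining root fields would be read here; the position is
  -- still just past the pointer field.
  return (RootTable version sub)

parseFile :: BL.ByteString -> Either String RootTable
parseFile whole = parseAt whole 0 (getRootTable whole)

-- A made-up file matching the diagram: 7 header bytes, padding up to
-- byte 200, then the sub table.
sample :: BL.ByteString
sample = BL.concat
  [ BL.pack [0x50, 0x04, 0xD2, 0, 0, 0, 200]   -- 0x04D2 == 1234
  , BL.replicate 193 0
  , BL.pack [0x00]
  , BLC.pack "hello, world!\0"
  ]
```

With these definitions, parseFile sample should yield a RootTable with
version 1234 whose sub table has flags 0 and the comment
"hello, world!". A pointer past the end of the file, or a comment with
no terminating NUL, comes back as a Left with an error message.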

On Sun, 2010-03-14 at 16:23 +0000, Stephen Tetley wrote:
> On 14 March 2010 16:03, david fries <djf at gmx.ch> wrote:
> [SNIP]
> 
> > Oddly enough, our customer never bothered to write a parser of their
> > own. I wonder why.
> 
> Hi David
> 
> If the binary structure was previously used only with C programs, it's
> quite common just to use casting to unpack the data into a struct -
> though your example seems to suggest this wasn't being done as the
> format had both big and little endian tables.
> 
> In Haskell or other modern functional languages like SML, parse trees
> are generally represented as algebraic types - so there are no
> pointers. If you're familiar with ANTLR from the OO world, it's rather
> like working with the tree definition automatically as opposed to
> generating classes from the data description language.
> 
> 
> Best wishes
> 
> Stephen



