[Haskell-cafe] Data.Binary questions

Lauri Pesonen lauri.pesonen at iki.fi
Sun Jan 20 13:18:08 EST 2008


Hi,

I'm relatively new to Haskell so please bear with me. I'm trying to
parse Java class files with Data.Binary and I'm having a few problems:

(The class file format is described here:
http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html
and the bytecode instructions are described here:
http://java.sun.com/docs/books/jvms/second_edition/html/Instructions.doc.html
)

1. The class file format contains a number of tables. The table
definitions start with the length of the list and carry on with that
many table entries. Lists would be a good representation for them in
Haskell, because  there is not need to index them directly (except
with the constants table). I've created my own list type so that I can
redefine the serialisation functions for it so that the serialisation
matches the format defined in the class file format:

newtype MyList e = MkList ([e])
        deriving Show

instance (Binary e) => Binary (MyList e) where
         put (MkList es) = do
                       put (fromIntegral (length es) :: Word16)
                       mapM_ put es

         get = do
             n <- get :: Get Word16
             xs <- replicateM (fromIntegral n) get
             return (MkList xs)

The problem is that one of the tables, namely the attribute_info
structures, use a u32 length field whereas all the other tables use a
u16 length field. My implementation uses u16, but it would be nice to
be able to use the same data type for both types of tables. I think I
can do it by adding a lenght field to MyList and specifying the type
when I use MyList in some other data structure, but that would also
mean that I have to keep track of the length of the list manually?

I'm basically copy-pasting the same code just to use a u32 length
field in the serialised form:

data Info = MkInfo [Word8]
     deriving Show

instance Binary Info where
         put (MkInfo xs) = do
                      put (fromIntegral (length xs) :: Word32)
                      mapM_ put xs

         get = do
             n <- get :: Get Word32
             xs <- replicateM (fromIntegral n) get
             return (MkInfo xs)


2. This is the bigger problem. The Java class file contains a
constants table in the beginning of the file. The other fields later
on in the class file contain indexes that reference entries in that
constants table. So in order to be able to replace an index in a data
structure with the actual string, I need to be able to look up the
string from the constants table while I'm deserialising the field.

My guess is that I should be able to put the constants table into a
state monad. On the other hand Data.Binary already uses the state
monad for holding onto the binary data being deserialised. So it's not
clear to me if I can use StateT with Data.Binary.Get? And if not, can
I implement my own state monad and do it that way? I'm not very
comfortable with Monads yet, so I might be missing something very
obvious.

This is what the get function looks like in my top-level Data.Binary instance:

         get = do magic <- get :: Get Word32
                  case magic == magicNumber of -- class files start
with 0xCAFEBABE
                       True -> do min <- get -- minor version number u16
                                  maj <- get -- major version number u16
                                  c <- get -- constants table
                                  a <- get -- access flags (public,
abstract, ...) u16
                                  t <- get -- a u16 index pointing to
the name of this class in the constants table
                                  s <- get -- a u16 index pointing to
the name of the super class in the constants table
                                  i <- get -- a table of interfaces
                                  f <- get -- a table of fields
                                  m <- get -- a table of methods
                                  attrs <- get -- a table of class
level attributes
                                  return (ClassFile (min, maj) c a t s
i f m attrs)
                       False -> error "Invalid magic number"

Thanks for all the help!

-- 
  ! Lauri


More information about the Haskell-Cafe mailing list