[Haskell-cafe] Parsing binary 'hierachical' objects for lazy developers

Maciej Marcin Piechotka uzytkownik2 at gmail.com
Wed Apr 27 21:46:08 CEST 2011


On Wed, 2011-04-27 at 20:16 +0200, John Obbele wrote:
> Hi Haskellers,
> 
> 
> I'm currently serializing / unserializing a bunch of bytestrings
> which are somehow related to each others and I'm wondering if
> there was a way in Haskell to ease my pain.
> 
> The first thing I'm looking for, is to be able to automatically
> derive "Serializable" objects, for example:
> 
> ---------------------------------------------------------------
> import Data.Serialize -- using cereal as an example
> 
> data MyFlag = One | Two | Three
> 
> instance Serialize [MyFlag] where
>     put = putWord16le . marshalFlags
>     get = unmarshal `fmap` getWord16le
> 
> data ObjectA = ObjectA { attribute0 :: Word8
>                        , attribute1 :: Word16le
>                        , attribute2 :: [MyFlag]
>                        } deriving (Serialize) -- magic goes here!
> ---------------------------------------------------------------
> 
> Unfortunately ghci complains that 'Serialize' is not a derivable
> class. Yet, deriving the Serialize instance for ObjectA should be
> simple, since all the three attributes are already serializable
> themselves...
> 
> 
> 
> Second issue, I would like to find a way to dispatch parsers. I'm
> not very good at expressing my problem in english, so I will use
> another code example:
> 
> ---------------------------------------------------------------
> -- let's say we have two objects with almost the same structure:
> data ObjectA = ObjectA { objLength   :: Int
>                        , objType     :: TypeId
>                        , attribute2a :: [MyFlag]
>                        }
> 
> data ObjectB = ObjectB { objLength   :: Int
>                        , objType     :: TypeId
>                        , attribute2b :: Word32le
>                        }
> ---------------------------------------------------------------
> 
> When we begin to deserialize theses objects, we don't know their
> final type, we just know how to read their length and their
> typeId.
> 
> Only then can we determine if what we are parsing is an ObjectA
> or an ObjectB.
> 
> Once we now the object type, we can resume the parsing and return
> either an ObjectA or ObjectB.
> 
> Oki, so I may have read too much of Peter Seibel's chapter on
> binary-data parsing in Common Lisp or spent too much time working
> on object-oriented code, but currently, I have no idea on how to
> write this 'simply' in Haskell :(
> 

I believe following should work

class Serializer ObjectA where
    get = check =<< (ObjectA <$> get <*> get <*> get)
          where check obj@(ObjectA len id attr)
                  | len < 10 && id == 0 = return obj
                  | otherwise = empty

class Serializer ObjectB where
    get = check =<< (ObjectB <$> get <*> get <*> get)
          where check obj@(ObjectB len id attr)
                  | len > 10 && id == 1 = return obj
                  | otherwise = empty

parseEitherAB :: Get (Either ObjectA ObjectB)
parseEitherAB = (Left <$> get) <|> (Right <$> get)

Regards

> 
> any help would be welcome
> /john


-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: This is a digitally signed message part
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20110427/c1ae43b8/attachment.pgp>


More information about the Haskell-Cafe mailing list