[Haskell-cafe] Help with Binary Serialization

Lennart Kolmodin kolmodin at gmail.com
Tue Jan 21 06:01:10 UTC 2014


Hi,

the format is quite simple, and it's the same in both binary and cereal.

For data types with only one constructor, only the values of the 
constructor are encoded - not the constructor itself.

data Foo = Foo Int Int deriving (Generic)
So Foo is encoded as if it were only the two Ints, without the Foo constructor.
encode (Foo 1 2) = 0x 00000001 00000002
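
In code, a minimal sketch of this (assuming binary >= 0.7, which can fill in
the Binary instance from the Generic representation) lets you inspect the
bytes yourself:

{-# LANGUAGE DeriveGeneric #-}
module Main where

import Data.Binary (Binary, encode)
import qualified Data.ByteString.Lazy as BL
import GHC.Generics (Generic)

data Foo = Foo Int Int deriving (Generic, Show)

-- Empty instance: the methods come from the Generic default.
instance Binary Foo

main :: IO ()
main =
  -- Print the raw bytes to see the exact layout your library version
  -- produces (the width used for Int may differ from the 4-byte
  -- illustration above).
  print (BL.unpack (encode (Foo 1 2)))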

If there are multiple constructors, a tag representing the constructor is 
encoded first, followed by the values of that constructor. The tag uses as 
many bytes as it needs to fit: 2-255 constructors use 1 byte, and so on.

data Fruit = Apple Int | Orange Foo

encode (Apple 3) = 0x 00 00000003
First byte is zero to represent the Apple constructor, then the Int.

encode (Orange (Foo 4 5)) = 0x 01 00000004 00000005
First byte is 0x01 to represent the Orange constructor, then the Foo value 
which is just two subsequent Ints.
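
The same kind of sketch for the two-constructor type (same assumptions as
above) shows the tag byte in front of the fields:

{-# LANGUAGE DeriveGeneric #-}
module Main where

import Data.Binary (Binary, encode)
import qualified Data.ByteString.Lazy as BL
import GHC.Generics (Generic)

data Foo   = Foo Int Int            deriving (Generic, Show)
data Fruit = Apple Int | Orange Foo deriving (Generic, Show)

instance Binary Foo
instance Binary Fruit

main :: IO ()
main = do
  -- The first byte printed is the constructor tag: 0 for Apple,
  -- 1 for Orange; the constructor's fields follow.
  print (BL.unpack (encode (Apple 3)))
  print (BL.unpack (encode (Orange (Foo 4 5))))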

The format will not change without warning, and has not changed since it 
was implemented.
That said, the generic format is mostly meant to be written and read by 
the Haskell app that defines the data types, as it's easy to break 
compatibility even within the same app.

JSON is probably not a bad choice; try it out:
http://hackage.haskell.org/package/aeson-0.6.1.0/docs/Data-Aeson.html
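
For example, a small sketch (assuming an aeson version with GHC Generics
support, so the instances can be left empty) for the same types:

{-# LANGUAGE DeriveGeneric #-}
module Main where

import Data.Aeson (FromJSON, ToJSON, decode, encode)
import GHC.Generics (Generic)

data Foo   = Foo Int Int            deriving (Generic, Show)
data Fruit = Apple Int | Orange Foo deriving (Generic, Show)

-- Empty instances: the JSON mapping is derived from Generic.
instance ToJSON Foo
instance FromJSON Foo
instance ToJSON Fruit
instance FromJSON Fruit

main :: IO ()
main = do
  print (encode (Orange (Foo 4 5)))                -- the JSON text
  print (decode (encode (Apple 3)) :: Maybe Fruit) -- and a round trip

Since JSON is self-describing, the Elm side can then read it with its own
JSON decoder rather than depending on a byte-level layout.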

Hope it helps!
Lennart

On Monday, 20 January 2014 21:28:33 UTC+4, Joey Eremondi wrote:
>
> Makes sense. I've had trouble finding documentation on the format used by 
> aeson, any links to that? 
> On 2014-01-19 10:46 AM, "John Lato" <jwl... at gmail.com> wrote:
>
>> I think this approach will likely lead to problems later on. First, 
>> undocumented formats are liable to change without warning. Secondly, it's 
>> conceivable that the format could change due to changes in ghc's generics 
>> support, or internal (and hence unstable) data structures of some component 
>> type. 
>>
>> Would it be possible to just define your own format instead, or use 
>> something like JSON that's already well-defined? 
>>
>> On Jan 18, 2014 12:17 PM, "Joey Eremondi" <jmit... at gmail.com> wrote:
>> >
>> > I was wondering if somebody could talk me through the default derived 
>> format for binary serialization used, either by binary or by cereal.
>> >
>> > I'm trying to share data between Haskell and another functional language 
>> (Elm) which also supports algebraic data types, so the conversion of data 
>> should be pretty trivial. I'd like to be able to just derive encode and 
>> decode in Haskell using either binary/cereal, and then write a parser for 
>> the same format in Elm. The trick is, I don't know what that format is.
>> >
>> > Is there any documentation on it, or if not, is anybody familiar enough 
>> with it that they could explain the format to me?
>> >
>> > Thanks!
>> >
>> > _______________________________________________
>> > Haskell-Cafe mailing list
>> > Haskel... at haskell.org
>> > http://www.haskell.org/mailman/listinfo/haskell-cafe
>> >
>>  
>