ANNOUNCE: protocol-buffers-0.2.9 for Haskell is ready

ChrisK haskell at
Sat Sep 20 14:06:50 EDT 2008

Hello one and all,

Amid much rejoicing, my Haskell version of protocol-buffers is now
released (version 0.2.9).

What is this for?  What does it do?  Why?

   Shorter answer: It generates Haskell data types that can be converted back 
and forth to lazy ByteStrings that interoperate with Google's generated code in 

   It is a pure Haskell re-implementation of the Google code at
   which is "...a language-neutral, platform-neutral, extensible way of 
serializing structured data for use in communications protocols, data storage, 
and more."
   Google's project produces C++, Java, and Python code.  This one produces 
Haskell code.

The release tarball (with 3 Haskell packages inside, see README in source) is at

The darcs repository has moved to

You will also need a recent ghc compiler, the "binary" package and the
"utf8-string" package from (same site as mentioned

The source compiles to 3 things:
   1) the package "protocol-buffers" with the library API
   2) the package "protocol-buffers-descriptor" with the
descriptor.proto code
   3) the 'hprotoc' executable, which is a command line program similar
to 'protoc'.

The "examples" sub-directory in the code has the Haskell version of
the "addressbook.proto" example and is compatible with Google's
similar example code.

The code generated from unittest.proto (and unittest_import.proto)
includes messages TestAllTypes and TestAllExtensions which have been
extensively tested by QuickCheck to ensure they can be wire encoded
and decoded (see the "tests" sub-directory in the code).

The user API, as exported by Text.ProtocolBuffers, allows for
converting messages back and forth to the lazy ByteString type.  Such
messages can also be merged, and their defaults can be accessed via
the MessageAPI type class.
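
To make "merged" and "defaults" a little more concrete, here is a
minimal sketch of the usual protocol-buffers merge rule (a later
optional value wins if it is set, repeated fields are concatenated).
It does not use the library: Note, defaultNote, mergeNote and
titleOrDefault are hypothetical names written by hand for this
illustration, and the "untitled" default is an assumption.

```haskell
import Data.Maybe (fromMaybe)
import Data.Sequence (Seq, (><))
import qualified Data.Sequence as Seq

-- Hypothetical message with one optional and one repeated field.
data Note = Note
  { noteTitle :: Maybe String   -- optional field
  , noteTags  :: Seq String     -- repeated field
  } deriving (Eq, Show)

-- The "default" message: nothing set, no repeated entries.
defaultNote :: Note
defaultNote = Note { noteTitle = Nothing, noteTags = Seq.empty }

-- Merge rule: the second message's optional value wins when present,
-- repeated fields are appended in order.
mergeNote :: Note -> Note -> Note
mergeNote a b = Note
  { noteTitle = maybe (noteTitle a) Just (noteTitle b)
  , noteTags  = noteTags a >< noteTags b
  }

-- Reading an optional field with a fallback (assumed default value).
titleOrDefault :: Note -> String
titleOrDefault = fromMaybe "untitled" . noteTitle
```

Merging with defaultNote on either side leaves a message unchanged,
which is the property one would expect of any such merge.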

The messages in Haskell are just regular data types and are thus
immutable.  Required types are simple record fields, optional types
are Maybe, and repeated types are Seq (from Data.Sequence).
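
As a rough illustration of that mapping, here is a hand-written
approximation of what a Person message in the spirit of
addressbook.proto might look like.  This is not actual hprotoc output:
the field names (personName and so on) and the addPhone helper are
assumptions made up for the sketch.

```haskell
import Data.Sequence (Seq, (|>))
import qualified Data.Sequence as Seq

-- Hypothetical nested message with a single required field.
data PhoneNumber = PhoneNumber
  { phoneNumber :: String
  } deriving (Eq, Show)

data Person = Person
  { personName   :: String            -- required: plain record field
  , personId     :: Int               -- required: plain record field
  , personEmail  :: Maybe String      -- optional: Maybe
  , personPhones :: Seq PhoneNumber   -- repeated: Seq
  } deriving (Eq, Show)

-- Values are immutable, so "adding" a phone builds a new Person
-- and leaves the original untouched.
addPhone :: PhoneNumber -> Person -> Person
addPhone p person = person { personPhones = personPhones person |> p }
```

Record update syntax is the natural way to "modify" such a value.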

Extensions are supported via Key data that allows access to the
extension fields.  Extensible messages contain an opaque ext'field
entry of type ExtField that contains the map data structure to contain
the extension field values.
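
The general idea of a typed key over an opaque map can be sketched as
follows.  This is only a toy model of the concept and not the
library's implementation: ExtValue, ExtMap, setExtension,
lookupExtension and the two example keys are all hypothetical names,
and the field numbers 100 and 101 are arbitrary.

```haskell
import qualified Data.Map as M

-- Hypothetical stand-in for a stored extension value.
data ExtValue = ExtInt Int | ExtText String
  deriving (Eq, Show)

-- Opaque container: extension field number -> stored value.
newtype ExtMap = ExtMap (M.Map Int ExtValue)
  deriving (Eq, Show)

emptyExt :: ExtMap
emptyExt = ExtMap M.empty

-- A key fixes the field number and the Haskell type of the value.
data Key v = Key
  { keyFieldId :: Int
  , keyWrap    :: v -> ExtValue
  , keyUnwrap  :: ExtValue -> Maybe v
  }

setExtension :: Key v -> v -> ExtMap -> ExtMap
setExtension key v (ExtMap m) =
  ExtMap (M.insert (keyFieldId key) (keyWrap key v) m)

-- Nothing if the field is absent or holds a value of another type.
lookupExtension :: Key v -> ExtMap -> Maybe v
lookupExtension key (ExtMap m) =
  M.lookup (keyFieldId key) m >>= keyUnwrap key

-- Two example keys with arbitrary field numbers.
nicknameKey :: Key String
nicknameKey = Key 100 ExtText unwrap
  where unwrap (ExtText s) = Just s
        unwrap _           = Nothing

ageKey :: Key Int
ageKey = Key 101 ExtInt unwrap
  where unwrap (ExtInt n) = Just n
        unwrap _          = Nothing
```

The point of the key is that the type checker, not the caller, decides
what type comes back out of the map.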

The user API can serialize a message as the usual series of fields.
It can also write a length prefix to create delimited messages, or
write a wire tag with any field number before the length and message
data.  This last form looks like a field on the wire, and there is a
special API call to read back just the one message and its field
number.  This last API is similar to the one that is part of the C#
API.
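
For readers unfamiliar with the wire format, the two framings can be
sketched by hand.  The code below follows the protocol-buffers wire
encoding (base-128 varints, wire type 2 for length-delimited data) but
is not the library's code: varint, withLength and asField are
hypothetical helper names, and the message body is taken as already
serialized bytes.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import qualified Data.ByteString.Lazy as L
import Data.Word (Word64)

-- Base-128 varint: 7 bits per byte, least significant group first,
-- high bit set on every byte except the last.
varint :: Word64 -> L.ByteString
varint n
  | n < 0x80  = L.singleton (fromIntegral n)
  | otherwise = L.cons (fromIntegral (n .&. 0x7F) .|. 0x80)
                       (varint (n `shiftR` 7))

-- Delimited message: varint length, then the message bytes.
withLength :: L.ByteString -> L.ByteString
withLength body = varint (fromIntegral (L.length body)) `L.append` body

-- Message framed as a field: the tag is (field number << 3) | 2,
-- followed by the length-delimited body.
asField :: Word64 -> L.ByteString -> L.ByteString
asField fieldNumber body =
  varint ((fieldNumber `shiftL` 3) .|. 2) `L.append` withLength body
```

For example, a two-byte body written as field 1 is preceded by the tag
byte 0x0A and the length byte 0x02.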

No benchmarks have been run yet.  Any suggestions?

Unsupported for the moment is loading and storing "unknown" fields.
It can be added sooner if someone has a use for this.

Unsupported indefinitely is code generation for Services and Methods.
I have yet to look into how this is presented in the other languages.

The API to read a single message field, as mentioned above, might be
extended to read any type instead of just messages.

optional clever_quote {
<autrijus> Perl: "Easy things are easy, hard things are possible"
<autrijus> Haskell: "Hard things are easy, the impossible just


   Chris Kuklewicz
