GHC might well be able to make use of such stuff too.  In general, one would like to be able to treat a file much like a database, as you suggest, with binary serialisation of data structures into it.

GHC's serialisation also includes a simple commoning-up mechanism for "leaves", especially strings.  We build a kind of dictionary, to avoid repeatedly re-serialising the same string.  I guess that any good binary serialisation will want to do something similar.  (Or something more dynamic, a la arithmetic coding.)
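
To make the dictionary idea concrete, here is a minimal sketch.  It is not GHC's actual code: the names Dict, intern, encodeLeaves and decodeLeaves are made up for illustration, and a real implementation would write the table and the indices out as bytes rather than keeping them as Haskell lists.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (foldl', sortOn)

-- | Hypothetical interning state: the next free index and the strings seen so far.
data Dict = Dict { nextIx :: !Int, seen :: !(Map.Map String Int) }

emptyDict :: Dict
emptyDict = Dict 0 Map.empty

-- | Look a string up, allocating a fresh index the first time it is seen.
intern :: String -> Dict -> (Int, Dict)
intern s d@(Dict n m) =
  case Map.lookup s m of
    Just i  -> (i, d)
    Nothing -> (n, Dict (n + 1) (Map.insert s n m))

-- | Replace every leaf string by its index.  The returned table holds each
-- distinct string exactly once, in index order.
encodeLeaves :: [String] -> ([String], [Int])
encodeLeaves xs = (table, reverse revIxs)
  where
    (revIxs, final) = foldl' step ([], emptyDict) xs
    step (acc, d) s = let (i, d') = intern s d in (i : acc, d')
    table = map fst (sortOn snd (Map.toList (seen final)))

-- | Inverse of encodeLeaves: look each index up in the table.
decodeLeaves :: ([String], [Int]) -> [String]
decodeLeaves (table, ixs) = map (byIx Map.!) ixs
  where
    byIx = Map.fromList (zip [0 ..] table)
```

A string that occurs many times then costs one table entry plus one small index per occurrence, instead of a full copy each time.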


| Interesting. A big bottleneck in jhc right now is reading the (quite
| large) binary ho and hl files on startup. A few things I have wanted out
| of a binary library are:
|  * the ability to create a hash of the structure of the underlying data
|    type, to verify you are reading data in the right format.
|  * extensible type-indexed sets (implemented hackily in Info.Binary in
|    jhc)
|  * being able to jump over unneeded data, as in go directly to the 112th
|    record, or the third field in a data structure without having to
|    slurp through everything that came before it.
|  * VSDB[1] style ACID updates as an option.
|  * VSDB style write-time optimized constant hash table. I don't mind
|    spending extra time when writing library files to speed up their
|    usage.
|  * mmap based reading.
| I was going to get around to writing this sometime, but perhaps there is
| room for a collaborative project in there. Is your code available
| somewhere, Bulat?
|         John
| [1] VSDB is my very simple database that ensures full ACID semantics using
|   just the file guarantees of unix, including the weaker guarantees of
|   NFS.
|   Sort of like STM on the filesystem.
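
The "jump over unneeded data" item in the list above could be sketched roughly as follows, assuming the writer stores an offset table next to the records.  The names Indexed, buildIndexed and recordAt are hypothetical, and this is neither jhc's nor GHC's file format; it only shows that record n can be reached without parsing records 0 to n-1.

```haskell
import qualified Data.ByteString.Char8 as BC

-- | Hypothetical indexed container: the start offset of every record (plus one
-- final end offset), together with the concatenated record bytes.
data Indexed = Indexed { offsets :: [Int], payload :: BC.ByteString }

-- | Write-time step: concatenate the records and remember where each begins.
buildIndexed :: [BC.ByteString] -> Indexed
buildIndexed recs =
  Indexed (scanl (+) 0 (map BC.length recs)) (BC.concat recs)

-- | Read-time step: fetch record i by slicing the payload.  Nothing is
-- returned for an out-of-range index.
recordAt :: Indexed -> Int -> Maybe BC.ByteString
recordAt (Indexed offs blob) i
  | i < 0     = Nothing
  | otherwise =
      case drop i offs of
        (start : end : _) -> Just (BC.take (end - start) (BC.drop start blob))
        _                 -> Nothing
```

Here the offset table is a plain list, so finding the offset is still linear in i; storing the offsets as a fixed-width array in the file would make that step constant time.  Since take and drop on a strict ByteString share the underlying buffer rather than copying it, the same slicing should carry over to a payload obtained by mmap.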
