[Haskell-cafe] Re: bizarre memory usage with data.binary

Jules Bean jules at jellybean.co.uk
Wed Oct 3 15:03:15 EDT 2007


Spencer Janssen wrote:
> On Tuesday 02 October 2007 19:51:47 Anatoly Yakovenko wrote:
>>> If its specifically the list instance, where we currently trade laziness
>>> for efficiency of encoding (which may or may not be the right thing),
>>> I'd suggest a fully lazy encoding instance?
>> Its not really a list, its more of a tree that has shared nodes, so
>> something like this:
>>
>>         A
>>      /     \
>>     B      C
>>       \   /
>>        D
>>      /   \
>>     E    F
>>
>> I suspect that maybe after encode/decode i end up with something like
>>
>>         A
>>      /     \
>>     B      C
>>    /          \
>>   D           D
>>  /   \        /   \
>> E    F    E     F
> 
> That is correct, binary doesn't attempt to share substructures.  If you'd like
> to do this, you'll need to do it by hand.

...and indeed it can't be done, except by the naive brute-force method 
of comparing every subtree, possibly optimised by cryptographically 
hashing a representation of every subtree, since sharing isn't an 
observable property.

Of course, hashing doesn't actually observe the real sharing present, 
rather, it computes maximal sharing. There are some applications where 
this could be a worthwhile win.

Jules

PS Well, except unsafePtrEquality but I don't really want to go there...



More information about the Haskell-Cafe mailing list