[Haskell-cafe] serialized data structures (Was: Generalized, named, and exportable default declarations)

Olaf Klinke olf at aatal-apotheke.de
Wed Apr 14 12:01:43 UTC 2021


On Tue, 2021-04-13 at 22:23 +0200, Henning Thielemann wrote:
> On Tue, 13 Apr 2021, Olaf Klinke wrote:
> 
> > > What if we would not complicate the language and generalize syntactic 
> > > sugar for Text, but instead improve data layout for all Haskell types and 
> > > eventually make a custom Text type unnecessary?
> > 
> > So essentially Gibbon's aim is to make String work like Text under the
> > hood, without me having to worry about it? +1 for that!
> 
> In principle yes.
> 
> I think the idea is the following: Linked data structures were invented 
> decades ago in order to easily perform in-place updates, e.g. insert and 
> remove elements from lists and trees. However, in Haskell we do not allow 
> in-place modifications and copying a data structure is not much more 
> expensive than traversing it. But if we copy the whole tree anyway why do 
> we still need pointers? We have to acknowledge that linked lists come with 
> some costs. A linked list with elements spread across memory have 
> cache-unfriendly access patterns. In contrast to that a serialized tree is 
> compact in memory and very cache friendly. You can still access a subtree 
> without copying.
> 
> Yet, linked data structures allow sharing of sub-trees or common list 
> tails. For this case we still need pointers. But maybe not as many as 
> today. Gibbon solves this by using two alternative internal identifiers 
> for every data constructor: One for an embedded subtree and one for a 
> pointer to a subtree.
> 
> This way a list or a String could be hold in one memory chunk or it could 
> be effectively a linked list of chunks or a linked list of single 
> characters if required.

Aha. So when I bind
ys = tail xs
where xs is compactly represented in memory, then the memory
representation of xs is broken up into (head xs : ys) where ys is still
compactly represented? I'm thinking procedurally here, which is
probably not the way the compliler handles things. How does this relate
to "compact regions"? I remember an announcement to this list not so
long ago. 

Olaf



More information about the Haskell-Cafe mailing list