Representing cyclic data structures efficiently in Haskell

Lex Stein stein@eecs.harvard.edu
Mon, 7 Jul 2003 11:22:16 -0400 (EDT)


Why not just use a GUID to represent each node. Then a C pointer just
becomes a GUID in a tuple alongside the node. The C pointer approach is
just a method to have the memory system dole out GUIDs. And it can be
generalised. Then all the tuples could be inserted in a hash table on the
GUID component. Edges are then tuples of two GUIDs.  These could be stored
in some other data structure.

Another approach would be to have a type (node * (guid list))  (excuse the
Ocaml syntax). Put elements of this type in a big array. The GUID of a
node (guid) is its array index. Each node is accompanied in the tuple by a
list of nodes that it points to.

You know your workload and whether you have advance knowledge of the
number of nodes, so you are in the position to compare the efficiency of
these 2 approaches.

Good luck
Lex

--
Lex Stein                     http://www.eecs.harvard.edu/~stein/
stein@eecs.harvard.edu        TEL: 617-233-0246

On Mon, 7 Jul 2003, Sarah Thompson wrote:

> What is the best way to represent cyclic data structures in Haskell?
>
> Lists are trivial. Trees are trivial. But what if the thing you're
> trying to represent is shaped like an electronic circuit, with a
> (possibly large) number of nodes connected by multiple arrows to other
> nodes? In an imperative language like C++, I'd probably just model the
> circuit directly, with objects representing nodes and pointers
> representing links between nodes. A good way to model this in Haskell is
> less obvious, however.
>
> A simplistic approach would be to represent such a structure as a list
> of nodes, where each node contained a list of other connected nodes.
> Containing the actual nodes within the sub-lists probably isn't feasible
> -- some kind of reference would be necessary in order to get around
> chicken-and-egg problems with instantiating the data structure. But, in
> this case, it's not so easy to see how one could 'walk' the structure
> efficiently, because moving from node to node would involve quite
> horrible (possibly slow) lookups. In C++, I'd simply follow a pointer,
> which would be an extremely fast operation.
>
> I would also need to implement efficient indexes into the data structure
> -- flat searching will be too slow for non-trivial amounts of data. In
> C++, I'd implement one or more external (probably STL-based) indexes
> that point into the main data structure.
>
> How do people typically solve this problem at the moment? I know a few
> people out there have written hardware compilers in Haskell and similar
> languages, so there should be some experience of this in the community.
> I can probably work something out soon enough, but reinventing the wheel
> is never a great idea.
>
> Thank you in advance,
>
> --
>       ----------------------------------------------
>      / __            + / Sarah Thompson      **** /
>     / (_  _  _ _ |_   /                        * /
>    /  __)(_|| (_|| ) / sarah@telergy.com      * /
>   / +               / http://findatlantis.com/ /
>  ----------------------------------------------
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>