Representing cyclic data structures efficiently in Haskell

Graham Klyne GK@ninebynine.org
Mon, 07 Jul 2003 19:03:25 +0100


Lazy evaluation means that you can embed nodes directly in a cyclic 
structure, even though that would appear to create an indefinite 
regression.  In order to detect cycles, one might add a node 
identifier.  Here's a simple example:

[[
-- spike-cyclicstructure.hs

data Node = Node { nodeid :: String, adjacent :: [Node] }

instance Eq Node where n1 == n2 = nodeid n1 == nodeid n2

instance Show Node where show = nodeid

findPaths :: Node -> Node -> [[Node]]
findPaths fromNode toNode = map reverse $ findPaths1 [] fromNode toNode

findPaths1 :: [Node] -> Node -> Node -> [[Node]]
findPaths1 beenThere fromNode toNode
     | fromNode == toNode = [fromNode:beenThere]
     | otherwise = concat [ findPaths1 (fromNode:beenThere) nextNode toNode
                          | nextNode <- adjacent fromNode
                          , not ( nextNode `elem` beenThere ) ]

n1 = Node "n1" [n2,n3,n4]
n2 = Node "n2" [n1,n3,n4]
n3 = Node "n3" [n1,n2]
n4 = Node "n4" [n1,n2]

test1 = findPaths n1 n2 -- [[n1,n2],[n1,n3,n2],[n1,n4,n2]]
test2 = findPaths n1 n3 -- [[n1,n2,n3],[n1,n3],[n1,n4,n2,n3]]
]]

One (crude) way to think about this is that a mention of a value in haskell 
behaves in some ways like a pointer to that value in (say) C.

If you don't want to assign unique identifiers by hand, you could use a 
state transformer monad to generate them for you.

#g
--

At 16:37 07/07/03 +0100, Sarah Thompson wrote:

>> From reading your message, it seems you want a graph library of sorts.
>The important bit is 'of sorts', unfortunately! Most graph libraries I've 
>seen to date (although I must admit that they have been written in C and 
>C++, including one I perpetrated personally) don't really work that well 
>for circuits because, for anything other than trivially simple components, 
>the connections between nodes need to be labelled.
>
>A component representing something simple like (say) an 'and' gate would 
>be easy enough to represent as a node of an ordinary directed graph. In 
>such a case, an incoming arrow would always be an input of the gate, and 
>an outgoing arrow would always be the output. I need to be able to 
>represent more complex components, up to and including the complexity of 
>things like CPUs, where you might have lots and lots of inputs and outputs 
>with substantially different meanings. In the case of a CPU, it would 
>never be acceptable to (for example) swap a chip select output for an 
>address or data line.
>
>This makes me think that an off-the-shelf graph library won't quite be 
>what I'm after, so I will almost certainly need to write the code. What 
>I'm more interested in at the moment is a view on the best kinds of 
>approach that might be appropriate.
>
>I'd considered something like embedding the type in the IO monad, with 
>links between components implemented as IORefs, but I'd really prefer 
>something cleaner (pure functional). If the code ends up horribly 
>complicated in order to get decent performance, I might as well fold and 
>use C++ instead.
>
>I'm currently wondering about going for an adjacency list approach (as 
>used by most graph libraries), with the tuples in the adjacency list 
>extended to include input/output labels, resulting in a 4-ary tuple rather 
>than the usual 2-ary. But, I don't want to be stuck with representing this 
>as a list -- I really need log N lookups or performance will stink 
>horribly as circuits get big. Maybe a pair of finite maps, allowing the 
>edges to be traversed in either direction?
>
>
>>If FGL seems like overkill to you, I have my own little mini graph
>>library which basically looks like:
>>
>>  - An ADT, Node:  newtype Node = Node Int deriving ...
>>  - The 'Graph n e' data structure, containing:
>>      - a FiniteMap from n -> Node
>>      - an (growable?) array of type IOArray (Node,Node) e
>>
>>the array represents the edges in the top half (i do some index fiddling
>>to get this efficient) and the finitemap represents the node labellings.
>>It's a pretty stupid interface and works a lot better when the graphs
>>are fixed size (that way you don't need to grow the array).  But it
>>works.
>>
>The finite map from n to Node seems like a good idea. Does representing 
>the adjacency list as a flat array cause any problems?
>
>Sarah
>
>--
>     ----------------------------------------------
>    / __            + / Sarah Thompson      **** /
>   / (_  _  _ _ |_   /                        * /
>  /  __)(_|| (_|| ) / sarah@telergy.com      * /
>/ +               / http://findatlantis.com/ /
>----------------------------------------------
>
>
>
>
>--
>      ----------------------------------------------
>     / __            + / Sarah Thompson      **** /
>    / (_  _  _ _ |_   /                        * /
>   /  __)(_|| (_|| ) / sarah@telergy.com      * /
>  / +               / http://findatlantis.com/ /
>----------------------------------------------
>
>
>_______________________________________________
>Haskell-Cafe mailing list
>Haskell-Cafe@haskell.org
>http://www.haskell.org/mailman/listinfo/haskell-cafe

-------------------
Graham Klyne
<GK@NineByNine.org>
PGP: 0FAA 69FF C083 000B A2E9  A131 01B9 1C7A DBCA CB5E