# [Haskell-cafe] Imperative graph algorithm, (tail?) recursion, and speed

Jeffrey Brown jeffbrown.the at gmail.com
Fri Mar 6 17:17:26 UTC 2015

```Indeed, I did look at FGL. It's beautiful. It has a bug that does not
permit multiedges in directed graphs, which I need. More importantly,
though, I'm just not ready for it. I have hardly written anything bigger
than a screenful of text in Haskell. If I try using a lot of things I don't
yet understand in addition to the base language I get discouraged.

On Fri, Mar 6, 2015 at 2:25 AM, Benjamin Edwards <edwards.benj at gmail.com>
wrote:

> Have you looked at fgl? I would also read the paper that went with the
> library. It has some nice notions of graph decomposition, and reading the
> source for bfs et al should give you a good idea of how to mix in state
> like 'what have I visited'.
>
> Ben
>
> On Fri, 6 Mar 2015 2:46 am Jeffrey Brown <jeffbrown.the at gmail.com> wrote:
>
>> Suppose I have a general (not a tree) directed graph, and I would like to
>> find all the "pure descendents" of a node N -- that is, all nodes M for
>> which (1) N is an ancestor of M, and (2) every partial or maximal sequence
>> of predecessors starting from M either consists entirely of descendents of
>> N, or else includes N itself.
>>
>> I have an imperative algorithm (below) for doing that. I want to know how
>> hard it would be to implement in Haskell. It's complex, but basically it
>> sets up a few lists like "members of this list have had all their
>> descendents processed", "members of this list have been processed but their
>> descendents have not been", "members of this list have not been processed
>> at all", etc. Then I iterate through the graph, moving nodes from one
>> collection to the other until the "to be processed" list is empty.
>>
>> I would like a function Graph -> Graph that does that and does not need
>> to rely on a monad. The only way I can think of involves a few mutually
>> recursive functions, each of which passes not only the original graph of
>> interest, but all the lists described in the previous paragraph, around to
>> the others. I don't understand Haskell's evaluation strategy well enough to
>> anticipate the result, but my vague sense is that a lot of copies of the
>> same data would be floating around, ruining the speed of it.
>>
>> Python code for the algorithm:
>>   def descendedOnlyFrom(gset):
>>     "All Gnodes n for which every ancestry of n includes a Gnode in gset
>> (which can be a set or a list)."
>>     # For efficiency improvement ideas and verbose comments,
>>       # see treeSearch/all.py/node.descendedOnlyFrom
>>     if 1: # variables
>>       # "pure" = "descended only from the calling Glist"
>>       pe = set(gset) # determined Pure, yet to Explore children of
>>       pf = set() # determined Pure, Finished exploring children of
>>       pb = set(gset) # determined Pure, Both kinds: pe | pf
>>       ud = set() # purity UnDetermined
>>         # descended from root, but might have parents outside of pb
>>       udp = {} # "Parents of the UnDetermined"
>>         # This is a dictionary, from gnodes to sets of gnodes.
>>         # The other four variables are sets.
>>     while pe:
>>       while pe:
>>         k = 1
>>         for c in set(i._children()):
>>           if c in pb | ud: continue # If already handled, do not repeat.
>>           if set(c._parents()) <= pb:
>>           else:
>>             ud.add(c);    udp[c] = set(c._parents()) - pb
>>       for i in ud:
>>         ipf = udp[i] & pb # (i)'s (p)arents newly (f)ound in pb
>>         if ipf:
>>           udp[i] -= ipf
>>           if udp[i]==set():
>>             ud.remove(i);    del( udp[i] )
>>             break
>>     return pb
>> _______________________________________________