Proposal for Data.Graph : Improve graph creation complexity when nodes have "consecutive" keys

Ivan Lazar Miljenovic ivan.miljenovic at gmail.com
Mon Apr 2 12:34:56 UTC 2018


On 2 April 2018 at 21:30, Olivier S. <olivier.sohn at gmail.com> wrote:
>
> Hello,
>
> I'm resending this proposal, which is simplified w.r.t the first one, and
> where I removed a wrong analysis of a benchmark.
>
> - Proposal I : Optimize the time complexity of (key -> Maybe Vertex) lookups
> and graph creation when keys are Integral and consecutive.
>
> (The related PR for proposal I is [1], including benchmarks showing the
> performance improvements.)
>
> Currently, (key -> Maybe Vertex) lookups returned by graphFromEdges consist
> of a binary search on an array, with a time complexity of O(log V) (I will
> use V for "Count of vertices", E for "Count of edges").

At the risk of bikeshedding, can you please use |V| and |E| to refer
to the order and size of the graph respectively?

> When key is Integral, and keys (of nodes passed to the graph creation
> function) form a set of /consecutive/ values (for example : [4,5,6,7] or
> [5,6,4,7]), we can have an O(1) lookup by substracting the value of the
> smallest key, and checking for bounds.

I'm not sure I follow this part; are you ignoring order in these lists
(you're referring to sets but using list notation)?

>
> Hence, graph creation complexity is improved, and user algorithms using (key
> -> Maybe Vertex) lookups will see their complexity reduced by a factor of
> up-to O(log V).
>
> The PR introduces this lookup and uses it for functions
> graphFromEdgesWithConsecutiveKeys and graphFromEdgesWithConsecutiveAscKeys.
>
> Here is a summary of complexities for (graph creation, lookup function) as
> they stand in the current state of the PR:
>
> - graphFromEdges (the currently existing function):
> O( (V+E) * log V ), O(log V)
> - graphFromEdgesWithConsecutiveKeys (new function):
> O( E + (V*log V) ), O(1)
> - graphFromEdgesWithConsecutiveAscKeys (new function) :
> O( V+E ), O(1)
>
> - Proposal II : Deprecate `graphFromEdges` taking [(node, key, [key])] in
> favor of `graphFromMap` taking (Map key (node,[key]))
>
> If we pass the same key twice in the list we pass to 'graphFromEdges' it is
> undefined which node for that key will actually be used.
> Introducing 'graphFromMap', taking a (Map key (node,[key]) would alleviate
> this issue, through the type used.

Off the top of my head, I'm not a big fan of this.  If we're going to
improve this, then I'd prefer to do so in such a way that allowed for
usage with IntMap (is there an existing type-class that covers
association list-style data structures?).  Ideally you could also use
HashMap from unordered-containers as well, but since we ultimately
want `type Vertex = Int` I'm not sure if that's worth it; IntMap,
however, is.

>
> Also, using a Map makes the implementation a bit more "natural" : there is
> no need for sorting by key, as Map.toAscList gives exactly the sorted list
> we want.
>
> We could also deprecate graphFromEdgesWithConsecutiveKeys and
> graphFromEdgesWithConsecutiveAscKeys (introduced in proposal I) in favor of
> graphFromConsecutiveMap.
>
> About the naming, I propose two different schemes:
>
> Either:
>     - graphFromEdges                 (takes a List, deprecated, existing
> function)
>     - graphFromEdgesInMap            (takes a Map)
>     - graphFromEdgesInConsecutiveMap (takes a Map with consecutive keys)
> Or:
>     - graphFromEdges                 (takes a List, deprecated, existing
> function)
>     - graphFromMap
>     - graphFromConsecutiveMap
>  with these, to reflect the Map / List duality in the naming scheme:
>     - graphFromList               (takes a List, deprecated, redirects to
> graphFromEdges)
>     - graphFromConsecutiveList    (takes a List, deprecated, redirects to
> graphFromEdgesWithConsecutiveKeys)
>     - graphFromConsecutiveAscList (takes a List, deprecated, redirects to
> graphFromEdgesWithConsecutiveAscKeys)
>
> Cheers,
> Olivier Sohn
>
> [1] https://github.com/haskell/containers/pull/549
>
>
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>



-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic at gmail.com
http://IvanMiljenovic.wordpress.com


More information about the Libraries mailing list