[Haskell-cafe] What do _you_ want to see in FGL?

Mon May 10 09:25:10 EDT 2010

Henning Thielemann <schlepptop at henning-thielemann.de> writes:

> Ivan Lazar Miljenovic schrieb:
>
>> Pros for allowing you to use a custom node type:
>> * Matches your data better
>> * No need for extra lookup maps when converting your data to FGL form
>> 
>> Cons:
>> * Makes type-sigs uglier/more verbose
>
> Unlabelled graphs with custom node type would have only one type
> parameter. :-)

Not quite: we'd be using ATs, so you'd have to have functions that have
something like "Show (Vertex g)" as a constraint in their type
signatures.

>> * Restricts the ability to optimise
>
> That's my question. As far as I know the set of node identifiers in a
> graph is not contiguous, thus you cannot use Arrays but you must use
> IntMap or so.

We're considering a vector-backed instance (we're not quite sure how
this will cope with match and &, but using a rope approach might work).

>> Using Int gives us a fixed-size data type with known good comparison
>> performance and with a range that should suit most purposes.
>
> Replacing IntMap by Map is not much slower, is it?

IntMap is faster than (Map Int) (which is why PatriciaTree has better
performance than Tree).

>> * It's easier to make a labelled graph act as an unlabelled one than the
>>   other way around.
>
> Sure, it's the same argument that in MatLab everything is a
> complex-valued matrix. They even represent Bool by a 1x1 complex-valued
> matrix. Possible, but clean? From today's viewpoint separation of
> labelled and unlabelled graphs is additional work, but I'm afraid there
> will arise problems with this design. Unfortunately, I can't tell them
> today.

Well, we'd need to have duplicate classes, duplicate instances,
duplicate graph operation functions, etc.  And for what?  To avoid
typing "()" a few times?

>> The current "type Node = Int" alias is only there to provide a unique
>> index type for referencing nodes; the actual Ints don't really represent
>> the nodes IMHO, the labels do (in conjunction with the index for
>> colouring, etc.).  In your example case of using PkgName (I assume you
>> mean PackageName or PackageIdentifier?), can you guarantee that each
>> PkgName value is unique before you go blindly using it (what about
>> having the same library installed in both the global and user
>> database?)?
>
> Of course you must choose data as key, that is actually a key. But this
> is not the problem of FGL. And with Node = Int, I also have to choose a
> key for my dictionary (Map Something Node).
> In my case I consider packages that are to be installed, so no
> distinction between globally and locally installed packages. I want
> really the package name as key.

So if I have "network-2.2.1.7" installed in both global package ID and
user package ID, what happens?  If I have two different versions of
network installed, what happens?

One of the things I'm considering doing with FGL is porting some of the
stuff from Graphalyze over, such that the actual user-level stuff just
considers the labels, and only the underlying functions deal with the
vertex type.

-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic at gmail.com
IvanMiljenovic.wordpress.com