[Haskell-cafe] Graph diagram tools?

Fri Apr 17 23:23:59 UTC 2015

On 18 April 2015 at 01:48, Ivan Zakharyaschev <imz at altlinux.org> wrote:
> Hello!
>
> 2015-04-17 4:23 UTC+03:00, Ivan Lazar Miljenovic <ivan.miljenovic at gmail.com>:
>
>>> ### Considering dotgen vs graphviz closer
>>>
>>> But looking into the examples, I see that `dotgen` can use "Haskell
>>> ids" to identify created nodes, whereas in graphviz's monad (see the
>
> To give readers clearer context, here is a short
> excerpt from that dotgen example:
>
>>>     refSpec <- src "S"
>>>     c1 <- box "S"
>>>     refSpec .->. c1

Note that src and box are custom functions, not part of dotgen.

>
>
>>> example above) one must supply extra strings as the unique ids (by
>>> which we refer to the nodes).
>
> Short example:
>
>>>         "start" --> "a0"
>>>
>>>         node "start" [shape MDiamond]
>
>
>> I used Strings as an example, as I was directly converting an existing
>> piece of Dot code; the original can be found here:
>> http://hackage.haskell.org/package/graphviz-2999.17.0.2/docs/Data-GraphViz-Types.html
>>
>> But, you can use any type you like for the node identifiers, as long
>> as you make them an instance of the PrintDot class.  That's where the
>> `n` in the `Dot n` type comes in.
>
> Ok, thanks for the valuable information!
>
>>> I like the first approach more ("Haskell ids").
>>
>> I admittedly don't have any ability in graphviz to create new
>> identifiers for you.  I could (just add a StateT to the internal
>> monadic stack which keeps track of the next unused node identifier)
>
> Since the API is already monadic, adding another monad to the stack
> wouldn't cause much difficulty for users of the API, because they
> wouldn't need to restructure their code (as they would when moving
> from pure functional code to monadic code).

Sure, this bit itself isn't a problem.
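
To make the idea concrete, here is a minimal sketch of a builder that
hands out fresh node identifiers.  None of this is graphviz code:
`Build`, `node`, `-->` and `render` are hypothetical names used only
for illustration, and a hand-rolled state monad stands in for StateT
so that the snippet needs nothing beyond base.

```haskell
-- Hypothetical builder state: next unused id, plus nodes and edges so far.
data G = G { nextId :: Int, nodes :: [(Int, String)], edges :: [(Int, Int)] }

-- A hand-rolled state monad, to keep the sketch dependency-free.
newtype Build a = Build { runBuild :: G -> (a, G) }

instance Functor Build where
  fmap f (Build g) = Build $ \s -> let (a, s') = g s in (f a, s')

instance Applicative Build where
  pure a = Build $ \s -> (a, s)
  Build f <*> Build g = Build $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad Build where
  Build g >>= k = Build $ \s -> let (a, s') = g s in runBuild (k a) s'

-- Opaque handle returned to the user instead of a user-supplied String.
newtype NodeId = NodeId Int deriving (Eq, Show)

-- Create a labelled node and return its freshly allocated id.
node :: String -> Build NodeId
node lbl = Build $ \s ->
  let n = nextId s
  in (NodeId n, s { nextId = n + 1, nodes = nodes s ++ [(n, lbl)] })

-- Record an edge between two previously created nodes.
(-->) :: NodeId -> NodeId -> Build ()
NodeId a --> NodeId b = Build $ \s -> ((), s { edges = edges s ++ [(a, b)] })

-- Run the builder and render plain Dot text.
-- Assumption: labels are simple enough that Haskell's `show` quoting is valid Dot.
render :: Build a -> String
render b = unlines $ ["digraph {"] ++ map nodeLine (nodes g) ++ map edgeLine (edges g) ++ ["}"]
  where
    (_, g) = runBuild b (G 0 [] [])
    nodeLine (n, l) = "  n" ++ show n ++ " [label=" ++ show l ++ "];"
    edgeLine (a, c) = "  n" ++ show a ++ " -> n" ++ show c ++ ";"

example :: String
example = render $ do
  s  <- node "start"
  a0 <- node "a0"
  s --> a0
```

With identifiers allocated this way, a mistyped node name is an
unbound Haskell variable rather than a silently created extra node.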

>
>> but I think that would _reduce_ the flexibility of being able to use
>> your own type (it would either only work for `Dot Int`, or you
>> could apply a mapping function to use something like `GraphID`, but
>> that has a problem if you have a `Double` with the same value - and
>> hence same textual representation - as your Int).
>
> I see:
> [GraphID](http://hackage.haskell.org/package/graphviz-2999.17.0.2/docs/Data-GraphViz-Types.html#t:GraphID)
> can have distinct values with the same textual representation.
>
> But if we are thinking about automatically creating new IDs, then this
> problem can simply be treated in the code for tracking which IDs have
> already been used.

Possibly a bit more complicated than it's worth: "OK, when I convert
this ID to a textual one it appears to be the same as one we've
already seen" would require a lot more bookkeeping, and won't help
prevent errors from explicit user-defined node IDs defined later
(unless we also use some form of backwards state to check for that as
well).
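
A rough sketch of the bookkeeping I mean.  It uses a made-up `Ident`
type rather than the real `GraphID`, and the rendering rule (a
whole-valued Double prints like an Int) is an assumption chosen to
provoke the clash, not a statement about how graphviz prints values.

```haskell
-- Hypothetical stand-in for an identifier type with several constructors.
data Ident = IntId Int | DblId Double | StrId String
  deriving (Eq, Show)

-- Assumed rendering rule for this sketch only: whole-valued doubles are
-- printed without a fractional part, so DblId 1.0 and IntId 1 both give "1".
renderId :: Ident -> String
renderId (IntId n) = show n
renderId (StrId s) = s
renderId (DblId d)
  | d == fromIntegral r = show r
  | otherwise           = show d
  where r = round d :: Integer

-- Register an identifier, reporting a clash when a *different* identifier
-- with the same rendered text has already been seen.
register :: [(String, Ident)] -> Ident -> Either String [(String, Ident)]
register seen i =
  case lookup txt seen of
    Just j | j /= i -> Left ("clash: " ++ show i ++ " renders like " ++ show j)
    Just _          -> Right seen
    Nothing         -> Right ((txt, i) : seen)
  where txt = renderId i
```

Note that this only compares against identifiers seen so far, so an
explicit ID introduced later would still need a second pass.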

>
> There could be two APIs: a "flexible" one with user-supplied IDs, and
> an "automatic" API. The "automatic" one is implemented on top of the
> "flexible" one.
>
>> The way I see it, graphviz is usually used for converting existing
>> Haskell values into Dot code and then processing with dot, neato, etc.
>
>> My preference - and hence overall design with graphviz - is that you
>> would generate the graph first, and _then_ convert it to a Dot
>> representation en masse.
>
> If the Haskell representation of the graph doesn't already have unique
> IDs for the nodes, then such an "automatic" layer would be useful as
> an intermediate step in the conversion. So it seems it won't be
> useless even in your standard scenarios.
>
> ***
>
> You name flexibility for the user as an advantage of the existing
> approach. As for some advantages of the other approach (with using
> Haskell ids for the nodes): the compiler could catch more errors.
>
> For example, if I make a typo in an identifier when introducing an
> edge, then the Haskell compiler would report this as an unknown
> identifier.

But you can always use variables rather than hard-coding the Strings
in... I don't *recommend* hard-coding Strings in, I just did so in
that sample usage just so you could compare it to the sample Dot code
and notice how similar it was.

>
> Also the compiler would catch name clashes, if you accidentally give
> the same id to two different nodes.
>
> A potential disadvantage is then increased verbosity: first, create
> the nodes, then use them for the edges. Meaning three actions instead of
> your single one:
>
>         "a0" --> "a1"
>
> Still, even in the "automatic ids" approach, this can be written
> compactly in a single
> line in the spirit of:
>
>     bindM2 (-->) (node [textLabel "a0"]) (node [textLabel "a1"])
>
> without explicitly giving Haskell ids to the two nodes.
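
For reference, `bindM2` is not in base; a definition along the
following lines would do.  The `node` and `edge` functions here are
hypothetical IORef-based stand-ins (not graphviz functions), just
enough to show that the one-liner creates two nodes and links them.

```haskell
import Data.IORef

-- Like liftM2, but for a function whose result is itself monadic.
bindM2 :: Monad m => (a -> b -> m c) -> m a -> m b -> m c
bindM2 f ma mb = do
  a <- ma
  b <- mb
  f a b

-- Hypothetical mutable graph: next free id, labelled nodes, edges.
type Graph = IORef (Int, [(Int, String)], [(Int, Int)])

-- Allocate a fresh id for a labelled node and return it.
node :: Graph -> String -> IO Int
node g lbl = atomicModifyIORef' g $ \(n, ns, es) -> ((n + 1, ns ++ [(n, lbl)], es), n)

-- Record an edge between two existing node ids.
edge :: Graph -> Int -> Int -> IO ()
edge g a b = modifyIORef' g $ \(n, ns, es) -> (n, ns, es ++ [(a, b)])

-- Create two nodes and connect them without naming either one.
demo :: IO (Int, [(Int, String)], [(Int, Int)])
demo = do
  g <- newIORef (0, [], [])
  bindM2 (edge g) (node g "a0") (node g "a1")
  readIORef g
```
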
>
> Perhaps, this is not important stuff, because--as you write--one is
> supposed to use Haskell representations of graphs and then convert
> them with graphviz... (I might simply not want to learn another
> language for representing graphs apart from dot, that's why I'd like
> to use the monadic API: because it closely follows the known dot format.)
>
> My last line of code already looks similar to code constructing a
> Haskell representation of a graph.
>
> I'm just writing down my comments concerning the API, not that I'm
> confident that I know a definite way to make it better.
>
> Well, after writing this post and thinking it all over while writing,
> I am coming to a conclusion that agrees with your opinion
> that the monadic API turned out not to be as useful as you used to think:
>
> it seems that while imposing the monadic style onto the programmer, it
> doesn't give the advantages a monad could give (like generating unique
> ids automatically and catching errors with undefined or clashing ids).
> Without this stateful feature, much else can be done purely with
> dedicated graph structures.
>
> What do you think about these comments?

Pretty much.  I think I had an actual use-case when I first wrote the
Monadic interface (some kind of tutorial from memory), but after I
finished it I realised it would be much simpler using the alternative
types.

If you have a data structure that already represents a graph, then
graphElemsToDot will let you convert that into the representation of a
Dot graph: http://hackage.haskell.org/package/graphviz-2999.17.0.2/docs/Data-GraphViz.html#v:graphElemsToDot
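
As a sketch of the preparation step, suppose the existing structure
is an adjacency list keyed by label and has no numeric IDs yet.  The
`Adjacency` type and `toElems` function below are hypothetical; they
only build a node list and an edge list, which is the general shape
of input graphElemsToDot expects alongside its parameters.

```haskell
import Data.Maybe (mapMaybe)

-- A graph given as an adjacency list keyed by label, without numeric ids.
type Adjacency = [(String, [String])]

-- Number the labels in order of appearance, then build the node list
-- [(id, label)] and the edge list [(from, to, edgeLabel)].
-- Edges to labels with no entry of their own are dropped.
toElems :: Adjacency -> ([(Int, String)], [(Int, Int, ())])
toElems adj = (ns, es)
  where
    ns  = zip [0 ..] (map fst adj)
    ids = [(l, n) | (n, l) <- ns]
    es  = [ (a, b, ())
          | (from, tos) <- adj
          , Just a <- [lookup from ids]
          , b <- mapMaybe (`lookup` ids) tos ]
```

The resulting pair could then be handed to graphElemsToDot together
with a parameter value such as nonClusteredParams; the IDs are
generated once, up front, rather than threaded through a monad.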

The only real reason I can come up with for using a Monadic interface
is when you want to embed a (relatively) static Dot graph into some
Haskell code and try and get some safety from the type-checker for
attribute values.  In that case, some relatively simple mapM_, etc.
expressions might come in handy.  But unless you have something rather
simple in mind, I don't think this is all that common.

> As for dotgen: my wishes could be satisfied simply with the dotgen
> package, but--as you wrote--it is not safe w.r.t. quoting/escaping
> user supplied values.

For simple values it should be OK.

-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic at gmail.com
http://IvanMiljenovic.wordpress.com