[Haskell-cafe] Re: What do _you_ want to see in FGL?
Heinrich Apfelmus
apfelmus at quantentunnel.de
Mon May 10 09:36:05 EDT 2010
Ivan Lazar Miljenovic wrote:
> Henning Thielemann writes:
>
>> Recently I wrote cabal-sort using FGL
>> http://hackage.haskell.org/package/cabal-sort
>>
>> It sorts cabal packages topologically according to their
>> dependencies. However, I was neither happy with the way FGL currently
>> works, nor with the way I proposed recently (splitting into unlabelled
>> and labelled graphs). I like to use the package name as node identifier.
>> I do not need any label, but I need a node type different from Int.
>> With current FGL I need to maintain a Map PkgName Int. Would it be
>> sensible to generalize the Node type to any Ord type or does FGL use
>> optimizations specific to Int? In another example I used FGL for
>> finding all topological orderings of a set of database
>> transactions. In this case I used an enumeration type as node
>> type. Are there other applications for alternative Node types?
>
> We're considering doing this.
>
> Pros for allowing you to use a custom node type:
> * Matches your data better
> * No need for extra lookup maps when converting your data to FGL form
>
> Cons:
> * Makes type-sigs uglier/more verbose
> * Restricts the ability to optimise
>
> Using Int gives us a fixed-size data type with known good comparison
> performance and with a range that should suit most purposes.
I have the same gripe as Henning, though I'm not sure I concur with his
proposal.
Here a snippet from a quick & dirty 'make' implementation that I use for
building my website:
data Rule = Rule { ins :: [FilePath], out :: FilePath,
effect :: IO () }
rules2Graph :: [Rule] -> G.Gr (IO ()) ()
rules2Graph rules = G.mkGraph nodes' edges'
where
nodes = [(out r, conditionally (effect r) (ins r) (out r))
| r <- rules]
edges = [(i, out r, ()) | r <- rules, i <- ins r,
i `Map.member` nodeMap]
-- ignore source nodes
nodeMap = Map.fromList nodes
index k = Map.findIndex k nodeMap
nodes' = map (\(a,b) -> (index a, b)) nodes
edges' = map (\(a,b,c) -> (index a, index b, c)) edges
The nodes are file paths, labeled with a corresponding IO action to
create the file. The nodes are created from a list of rules that specify
how to create an output file from several input files.
As you can see, I had to use Data.Map to convert file paths into node
indexes. Ideally, the Data.Graph.Inductive.NodeMap module should be
used for that, but after some frustration, I found it completely
unsuitable for this task because it insists on using the graph label as
node identifier.
I am particularly unhappy about the definitions of nodes' and edges'
, the clever use of Map.findIndex to translate indexes while keeping
track of a label and the need for mapping the indexes myself.
I'm not sure what the right solution is, but I think it definitely
involves catering for different node types. For instance, the library
could operate on a type
newtype Graph node a b = Graph (Gr a b, Data.Map.Map Int node)
or it could offer a more useful NodeMap type and make the Node type
abstract. Some systematic and simple abstractions to manage nodes is
needed anyway.
Also, hard-coding Node = Int feels like the wrong kind of flexibility:
the user is able to do anything with it, just in case the library forgot
some important functionality. Which is exactly what happened in my case
when I used Map.findIndex . I prefer the library to get it right.
PS: While we're at it, I think newNodes should return an infinite list
of Node instead of requiring a fixed number to be specified in advance?
Regards,
Heinrich Apfelmus
--
http://apfelmus.nfshost.com
More information about the Haskell-Cafe
mailing list