[Haskell-cafe] Graph theory analysis of Haskell code

Ivan Miljenovic ivan.miljenovic at gmail.com
Thu Dec 6 18:31:00 EST 2007


On 07/12/2007, Tommy McGuire <mcguire at crsr.net> wrote:
> > How I envisage it happening is that a parser would be used to find all
> > "functions" in the given code, treat these as nodes in the graph, and
> > then use directed edges to indicate which functions call which other
> > functions.  The resulting graph can then be analysed in various ways
> > suitable to the context (e.g. finding that a library module can be
> > split in two because there are two completely separate trees in the
> > graph that don't interact at all, or that a function only ever called
> > by one other function can be subsumed into its caller).
>
> It just occurred to me that this idea is more general than the control
> or data flow analyses that are the focus of similar ideas I've seen
> before.  For example, you could trace type usage through the code
> (which would likely be a subset of the control flow for Haskell, but
> an interesting subset nonetheless).  :-)
>

Much as I'd like to do this, I'm not sure I can delve too deeply into
Haskell specifics, for the sake of ensuring that the project contains
enough mathematics (and to keep my supervisor happy, since he is
skeptical about why I keep bringing Haskell into everything, even
though I've re-implemented in Haskell a program he wrote in C, my
version generates more/better answers, and we _still_ can't find the
bugs in his code, but that's another story...).  Of course, if I can
somehow turn type usage into nodes on the graph, then that should be
all right! :D
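For what it's worth, both analyses mentioned above (splitting a module
along disconnected pieces of the call graph, and spotting functions
that could be subsumed into their sole caller) can be sketched with
Data.Graph from the containers library.  The call graph below is made
up purely for illustration:

```haskell
import Data.Graph (graphFromEdges, components)
import Data.List (sort)
import Data.Tree (flatten)

-- A made-up call graph: (payload, function name, functions it calls).
callGraph :: [(String, String, [String])]
callGraph =
  [ ("main",   "main",   ["parse", "render"])
  , ("parse",  "parse",  ["lexer"])
  , ("lexer",  "lexer",  [])
  , ("render", "render", [])
  , ("stats",  "stats",  ["mean"])
  , ("mean",   "mean",   [])
  ]

-- Weakly connected components of the call graph: more than one
-- component suggests the module could be split.
splitCandidates :: [[String]]
splitCandidates =
  let (g, fromVertex, _) = graphFromEdges callGraph
      name v = let (_, k, _) = fromVertex v in k
  in map (sort . map name . flatten) (components g)

-- Functions with exactly one caller are candidates for being
-- subsumed (inlined) into that caller.
inlineCandidates :: [String]
inlineCandidates =
  [ f | (f, _, _) <- callGraph
      , length [ c | (c, _, calls) <- callGraph, f `elem` calls ] == 1 ]

main :: IO ()
main = do
  print splitCandidates   -- two components: stats/mean never touch the rest
  print inlineCandidates  -- each of these has exactly one caller
```

Strongly connected components (Data.Graph.scc) would be the thing to
use instead if mutual recursion matters for the analysis.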

On 06/12/2007, Ketil Malde <ketil+haskell at ii.uib.no> wrote:
> I'll just add that having a tool visualizing this would be very useful
> for refactoring code.  If you e.g. use color to distinguish
> nodes/functions from different modules, you could use that information
> to decide to merge or split modules to minimize external interfaces.
>
> You could also try to automatically cluster nodes, which would be more
> interesting theoretically, but IMO less likely to be practical.
>
> Another option would be to couple this with profiling or coverage
> information to visualize something about the usage of paths and nodes
> in the call graph.

This is partially what I was hoping to do.  I do know that my
supervisor was interested in examining C code and attaching cost
information to the nodes (using some Unix profiling tool whose name he
couldn't remember), but I'm not sure how I'd go about integrating
compilation and profiling into such a program.
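As a rough sketch of Ketil's colour idea, something like the following
could emit a GraphViz dot file with one node per function, filled by
module colour, ready for `dot -Tpng`.  The functions, modules and
palette here are all invented for the example:

```haskell
-- Hypothetical input: (function name, module it lives in, callees).
type FunInfo = (String, String, [String])

-- A tiny module-to-colour palette; purely illustrative.
colour :: String -> String
colour "Parser" = "lightblue"
colour "Render" = "salmon"
colour _        = "white"

-- Emit GraphViz dot: one node per function (coloured by module),
-- one directed edge per call.
toDot :: [FunInfo] -> String
toDot funs = unlines $
     ["digraph callgraph {"]
  ++ [ "  " ++ show f ++ " [style=filled, fillcolor="
            ++ show (colour m) ++ "];"
     | (f, m, _) <- funs ]
  ++ [ "  " ++ show f ++ " -> " ++ show c ++ ";"
     | (f, _, cs) <- funs, c <- cs ]
  ++ ["}"]

main :: IO ()
main = putStr $ toDot
  [ ("parse",  "Parser", ["lexer"])
  , ("lexer",  "Parser", [])
  , ("render", "Render", [])
  ]
```

Attaching profiling costs would then just be a matter of adding more
node attributes (e.g. sizing nodes by cost), once the cost data exists.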

-- 
Ivan Lazar Miljenovic
