[Haskell-cafe] Trying to fix space leak
Ben Challenor
ben at challenor.org
Mon Dec 31 10:20:07 EST 2007
Hi
I'm learning Haskell through writing a compiler. I'm seeing huge memory
use in a function which converts the dataflow graph to the form required
by Data.Graph. It needs to return a map from dataflow nodes to Vertexs,
a map in the other direction, and the list of edges (as Vertex pairs).
> total time = 18.78 secs (939 ticks @ 20 ms)
> total alloc = 6,616,297,296 bytes (excludes profiling overheads)
>
> COST CENTRE MODULE %time %alloc
>
> j Dataflow 87.6 99.9
> graphvizCompile Dataflow 6.4 0.0
> f Dataflow 2.9 0.0
> con2tag_DFReal# Representation 2.1 0.0
I assume the allocation is being garbage-collected pretty quickly,
because a) 6,616,297,296 bytes is stupid (!) and b) Process Explorer
informs me that the peak private bytes of the program is not more than a
couple of MB.
The offending code looks like this:
> -- (next Vertex available for use, edges so far, map so far, map so far)
> type DependencyEdgesAcc = (Vertex, [Edge], Map.Map DFNode Vertex, Map.Map Vertex DFNode)
>
> -- Process a list of DFNodes in the same context.
> dependencyEdges :: DependencyEdgesAcc -> [DFNode] -> DependencyEdgesAcc
> dependencyEdges acc [] = {-# SCC "a" #-} acc
> dependencyEdges acc@(_, _, mnv1, _) (n:ns) =
> -- Check whether this node has been done before.
> {-# SCC "j" #-} case {-# SCC "i" #-} Map.lookup n mnv1 of
> Just _ -> {-# SCC "b" #-} dependencyEdges acc ns
> Nothing ->
> -- Find the list of dependencies.
> let ndeps = {-# SCC "c" #-} nodeDependencies n in
> -- Process them first.
> let (v2, es2, mnv2, mvn2) = {-# SCC "d" #-} dependencyEdges acc ndeps in
> -- Now we can claim v2 as the label for this node n.
> -- Look up vdep for each dependency ndep and add an edge from the vdep to v2.
> let {(v4, es4, mnv4, mvn4) = {-# SCC "e" #-} List.foldl' (
> \(v3, es3, mnv3, mvn3) ndep ->
> case {-# SCC "f" #-} Map.lookup ndep mnv3 of
> Just vdep -> (v3, (vdep, v2):es3, mnv3, mvn3)
> Nothing -> error $ "this node should have been added already: " ++ show ndep
> ) (v2, es2, mnv2, mvn2) ndeps } in
> assert (v2 == v4)
> assert (mvn2 == mvn4)
> assert (mnv2 == mnv4)
> -- Finally, add this node to the accumulator and then recurse.
> dependencyEdges (v2+1, es4, {-# SCC "g" #-} Map.insert n v2 mnv2, {-# SCC "h" #-} Map.insert v2 n mvn2) ns
with the following profile:
> dependencyGraph Dataflow 4395 1 0.0 0.0 93.4 99.9
> getRootDFNodes Dataflow 4397 5 0.0 0.0 0.0 0.0
> dependencyEdges Dataflow 4396 282 0.0 0.0 93.4 99.9
> a Dataflow 4407 104 0.0 0.0 0.0 0.0
> j Dataflow 4399 178 87.6 99.9 93.4 99.9
> b Dataflow 4417 75 0.0 0.0 0.0 0.0
> g Dataflow 4405 103 0.1 0.0 0.2 0.0
> con2tag_DFBool# Representation 4419 1972 0.0 0.0 0.0 0.0
> con2tag_DFReal# Representation 4414 186890 0.1 0.0 0.1 0.0
> con2tag_DFNode# Representation 4413 1204 0.0 0.0 0.0 0.0
> h Dataflow 4404 103 0.0 0.0 0.0 0.0
> e Dataflow 4403 103 0.0 0.0 4.8 0.0
> f Dataflow 4412 174 2.9 0.0 4.8 0.0
> con2tag_DFBool# Representation 4420 245794 0.0 0.0 0.0 0.0
> con2tag_DFReal# Representation 4416 25772482 1.9 0.0 1.9 0.0
> con2tag_DFNode# Representation 4415 2068 0.0 0.0 0.0 0.0
> d Dataflow 4402 103 0.0 0.0 0.0 0.0
> c Dataflow 4401 103 0.0 0.0 0.0 0.0
> nodeDependencies Dataflow 4406 103 0.0 0.0 0.0 0.0
> i Dataflow 4400 178 0.6 0.0 0.7 0.0
> con2tag_DFBool# Representation 4418 48230 0.0 0.0 0.0 0.0
> con2tag_DFReal# Representation 4410 5045646 0.1 0.0 0.1 0.0
> con2tag_DFNode# Representation 4409 1928 0.0 0.0 0.0 0.0
Is this a space leak, or is my algorithm just silly? (It could well be
the latter...)
I've tried adding random bangs, to no effect. So, can anyone help me? :)
Cheers
Ben
More information about the Haskell-Cafe
mailing list