[Haskell-cafe] need help on building a huge graph

Daniel Fischer daniel.is.fischer at googlemail.com
Wed May 11 11:51:58 CEST 2011


On Wednesday 11 May 2011 09:25:05, Bin Jin wrote:
> sorry, it's 15 seconds. It's a typo

It's less for me, something like 5.8s compiled without optimisations, 3.6s 
with -O2 (Pentium4, 3.06GHz).
The overwhelming part of that is spent in GC:

  MUT   time    0.69s  (  0.73s elapsed)
  GC    time    5.07s  (  5.10s elapsed)

resp.

  MUT   time    0.20s  (  0.20s elapsed)
  GC    time    3.36s  (  3.38s elapsed)

From now on, everything is compiled with -O2.
Giving type signatures that force things to Int (without any type 
signatures beyond the one given, you have

graph :: Array Integer [(Integer,Int)]
edges :: [(Integer,(Integer,Int))]
etc.

), that goes down to

  MUT   time    0.12s  (  0.11s elapsed)
  GC    time    3.10s  (  3.10s elapsed)

A nice relative improvement for the productive work, but since that is 
almost negligible compared to GC anyway, it doesn't make a big difference 
altogether.

You can shave a large slice off the GC if you construct the second 
component pairs in edges directly, at the moment you construct one (x,y) 
pair, then deconstruct it to build a (y,1-x) pair; building the latter 
directly reduces allocation and GC significantly:

     137,930,992 bytes allocated in the heap
     429,681,256 bytes copied during GC
  MUT   time    0.07s  (  0.07s elapsed)
  GC    time    1.20s  (  1.20s elapsed)

Then using a stricter accumulation function,

upd :: [(Int,Int)] -> (Int,Int) -> [(Int,Int)]
upd xs p@(x,y) = x `seq` y `seq` (p:xs)

instead of (flip (:)) forces some more thunks and further reduces 
allocation and GC (but not very much):

      98,845,888 bytes allocated in the heap
     350,740,560 bytes copied during GC
  MUT   time    0.07s  (  0.07s elapsed)
  GC    time    1.06s  (  1.06s elapsed)

Much better than before, but still way too much GC. Reducing GC by 
specifying a larger allocation area helps:

-A32M:
  MUT   time    0.10s  (  0.10s elapsed)
  GC    time    0.40s  (  0.40s elapsed)

-A64M:
  MUT   time    0.14s  (  0.14s elapsed)
  GC    time    0.29s  (  0.29s elapsed)

-A128M:
  MUT   time    0.17s  (  0.17s elapsed)
  GC    time    0.00s  (  0.00s elapsed)

> 
> On Wed, May 11, 2011 at 3:20 PM, Bin Jin <bjin1990 at gmail.com> wrote:
> > Hi, cafe
> > 
> > I'm recently solving some algorithm puzzles from previous Google Code
> > Jam rounds, and today I met some problem I can't solve now.
> > It's a problem (Round 3 2010, B) that the solution require to find a
> > shortest path in a graph with about 100,000 vertices
> > and 50 edges from each vertex.  Apart from find shortest path, I find
> > that building graph also takes a lot of time.
> > Yes, I know it's unnecessary to dump the entire graph to the memory,
> > but I want to know if it's possible to build such graph in some
> > reasonable time.
> > 
> > following is a snippet of code building the graph, runs about 15 mins
> > on Core 2 Duo 2.40GHz without any RTS flags
> > 
> >> > import Data.Array
> >> > 
> >> > main = graph `seq` putStrLn "Finished"
> >> > 
> >> > n = 100000
> >> > bnds = (0, n-1)
> >> > 
> >> > buildGraph bnds edges = accumArray (flip (:)) [] bnds edges
> >> > 
> >> > graph = buildGraph bnds edges
> >> > 
> >> > edges = [ (i, (y, 1 - x::Int))
> >> > 
> >> >         | i <- range bnds
> >> >         
> >> >         , j <- [2..50]
> >> >         , let (x, y) = if i + j >= n then (1, i + j - n) else (0, i
> >> >         + j) ]
> > 
> > Regards,
> > Bin Jin



More information about the Haskell-Cafe mailing list