[Haskell-cafe] Please help me spot where space leak occur.

Jason Dagit dagitj at gmail.com
Thu Jul 21 14:14:45 CEST 2011


On Thu, Jul 21, 2011 at 1:22 AM, Olexander Kozlov <ookozlov at gmail.com> wrote:
> Greetings Haskell community,
> I stuck with a space leak problem in simple task. I have a function
> described with a list of pairs:
>    type Fn a b = [(a, b)]
> mapping values from a to b. And I have a composition operation which accepts
> two functions and returns
> theirs composition:
>    compose :: Eq b => Fn a b -> Fn b c -> Fn a c
> Finding composition of 1,000,000 functions takes about 240M with the
> following implementation:
> type Fn a b = [(a, b)]
> lkup :: Eq a => a -> Fn a b -> b
> lkup x ((a, b) : abs) = if x /= a then lkup x abs else b
> compose :: Eq b => Fn a b -> Fn b c -> Fn a c
> compose ((a, b) : abs) bcs = (a, lkup b bcs) : compose abs bcs
> compose [] _ = []
> fldl :: (a -> b -> a) -> a -> [b] -> a
> fldl f z (x:xs) = fldl f (f z x) xs
> fldl _ z [] = z
> fna = [(1, 2), (2, 3), (3, 1)]
> fni = [(1, 1), (2, 2), (3, 3)]
> test = fldl compose fni (replicate 1000000 fna)
> main = print test
> I build with:
>    ghc --make -prof -auto-all -caf-all -fforce-recomp -rtsopts Test.hs
> Profile results:
>   Test.exe +RTS -K50M -p -RTS
> total time  =        0.34 secs   (17 ticks @ 20 ms)
> total alloc = 240,003,820 bytes  (excludes profiling overheads)
> COST CENTRE                    MODULE               %time %alloc
> compose                        Main                  58.8   80.0
> lkup                           Main                  35.3    0.0
> test                           Main                   5.9   11.7
> fldl                           Main                   0.0    8.3
> After reading 'High-Performance Haskell' by Johan Tibell I tried to made my
> compose function strict:
> compose :: Eq b => Fn a b -> Fn b c -> Fn a c
> compose ((a, b) : abs) bcs = let c = lkup b bcs in c `seq` (a, c) : compose
> abs bcs
> compose [] _ = []
> and memory consuption reduced down to 180M. Good achievement but still too
> much!
> Profile results:
>   Test.exe +RTS -K50M -p -RTS
> total time  =        0.24 secs   (12 ticks @ 20 ms)
> total alloc = 180,003,820 bytes  (excludes profiling overheads)
> COST CENTRE                    MODULE               %time %alloc
> compose                        Main                  83.3   73.3
> lkup                           Main                   8.3    0.0
> fldl                           Main                   8.3   11.1
> test                           Main                   0.0   15.6
> I tried to make fldl strict as well:
> fldl :: (a -> b -> a) -> a -> [b] -> a
> fldl f z (x:xs) = let z' = f z x in z' `seq` fldl f z' xs
> fldl _ z [] = z
> and end up with 160M of total allocations.
> Profile results:
>   Test.exe +RTS -K32M -p -RTS
> total time  =        0.30 secs   (15 ticks @ 20 ms)
> total alloc = 160,003,820 bytes  (excludes profiling overheads)
> COST CENTRE                    MODULE               %time %alloc
> compose                        Main                  46.7   82.5
> lkup                           Main                  40.0    0.0
> test                           Main                   6.7   17.5
> fldl                           Main                   6.7    0.0
> Please help me spot where space leak occur, I see no other places for space
> leaks. I'll appreciate any help.

The space leak is because your fldl is not as strict as is needed.
This version of the program needs about 1 meg on my computer:
{-# LANGUAGE BangPatterns #-}
type Fn a b = [(a, b)]

lkup :: Eq a => a -> Fn a b -> b
lkup x ((a, b) : abs) = if x /= a then lkup x abs else b

compose :: Eq b => Fn a b -> Fn b c -> Fn a c

compose ((a, b) : abs) bcs = (a, lkup b bcs) : compose abs bcs
compose [] _ = []

-- fldl :: (a -> b -> a) -> a -> [b] -> a
fldl f z (x:xs) =
  let z'@[(!z1,!z2),(!z3,!z4),(!z5,!z6)] = f z x
  in z' `seq` fldl f z' xs
fldl _ z [] = z

fna = [(1, 2), (2, 3), (3, 1)]
fni = [(1, 1), (2, 2), (3, 3)]

test = fldl compose fni (replicate 1000000 fna)

main = print test


$ Test +RTS -K50M -p -s -RTS
[(1,2),(2,3),(3,1)]
     351,125,156 bytes allocated in the heap
         108,576 bytes copied during GC
          26,888 bytes maximum residency (1 sample(s))
          18,168 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:   669 collections,     0 parallel,  0.00s,  0.00s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.01s elapsed)
  MUT   time    0.50s  (  0.50s elapsed)
  GC    time    0.00s  (  0.00s elapsed)
  RP    time    0.00s  (  0.00s elapsed)
  PROF  time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    0.50s  (  0.51s elapsed)

  %GC time       0.0%  (0.5% elapsed)

  Alloc rate    703,371,486 bytes per MUT second

  Productivity 100.0% of total user, 98.4% of total elapsed

When checking for a space leak, I find the output of +RTS -s -RTS more
helpful than the allocation stats in the .prof:

$ cat Test.prof
        Thu Jul 21 05:12 2011 Time and Allocation Profiling Report  (Final)

           Test.exe +RTS -K50M -p -s -RTS

        total time  =        0.48 secs   (24 ticks @ 20 ms)
        total alloc = 228,003,836 bytes  (excludes profiling overheads)

COST CENTRE                    MODULE               %time %alloc

fldl                           Main                  37.5    3.5
compose                        Main                  29.2   84.2
lkup                           Main                  25.0    0.0
test                           Main                   8.3   12.3

If I went by just the .prof I might think the program took 228MB of
ram while it was running, but it only needed about 1MB because it kept
reusing that space.

I hope that helps,
Jason



More information about the Haskell-Cafe mailing list