[Haskell-cafe] why does the binary library require so much memory?

Jeremy Shaw jeremy at n-heptane.com
Fri Jul 31 16:56:05 EDT 2009


Hello,

Using encode/decode from Binary seems to permanently increase my
memory consumption by a factor of about 60. I am wondering whether I
am doing something wrong, or whether this is an issue with Binary.

If I run the following program, it uses a sensible amount of memory
(1 MB). Note that the bin and list' thunks are never actually evaluated:

import Data.Binary

main :: IO ()
main =
    let list = [1..1000000] :: [Int]
        bin   = encode list
        list' = decode bin :: [Int]
    in putStrLn (show . length $ takeWhile (< 10000000) list) >> getLine >> return ()

/tmp $ ghc --make -O2 Bin.hs -o bin
/tmp $ ./bin +RTS -s
/tmp/bin +RTS -s 
1000000

      68,308,156 bytes allocated in the heap
           6,700 bytes copied during GC
          18,032 bytes maximum residency (1 sample(s))
          22,476 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:   130 collections,     0 parallel,  0.00s,  0.00s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.05s  (  0.92s elapsed)
  GC    time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    0.05s  (  0.92s elapsed)

  %GC time       0.0%  (0.1% elapsed)

  Alloc rate    1,313,542,603 bytes per MUT second

  Productivity 100.0% of total user, 5.7% of total elapsed

According to top:

   VIRT   RSS   SHR
   3880  1548   804


Now, if I change *list* in the last line to *list'* so that the
encode/decode stuff actually happens:
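
Spelled out, the modified program is just the following (a sketch
reconstructed from the description above; only the last line differs,
with list' in place of list, so the length is now computed from the
decoded list):

```haskell
import Data.Binary (encode, decode)

main :: IO ()
main =
    let list  = [1..1000000] :: [Int]
        bin   = encode list               -- serialise to a lazy ByteString
        list' = decode bin :: [Int]       -- deserialise back to a list
    -- Consuming list' (rather than list) forces both encode and decode.
    in putStrLn (show . length $ takeWhile (< 10000000) list') >> getLine >> return ()
```

Built and run the same way as before, this gives: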

/tmp $ ./bin +RTS -s
/tmp/bin +RTS -s 
1000000

     617,573,932 bytes allocated in the heap
     262,281,412 bytes copied during GC
      20,035,672 bytes maximum residency (10 sample(s))
       2,187,296 bytes maximum slop
              63 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  1151 collections,     0 parallel,  0.47s,  0.48s elapsed
  Generation 1:    10 collections,     0 parallel,  0.36s,  0.40s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.47s  ( 20.32s elapsed)
  GC    time    0.84s  (  0.88s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    1.30s  ( 21.19s elapsed)

  %GC time      64.1%  (4.1% elapsed)

  Alloc rate    1,319,520,653 bytes per MUT second

  Productivity  35.9% of total user, 2.2% of total elapsed

And top reports:

   VIRT    RSS   SHR
   67368   64m   896

That is 63 times as much total memory in use (63 MB versus 1 MB), and
this is while the program is waiting at 'getLine', after it is 'done'
with the data.

I am using GHC 6.10.4 on GNU/Linux.

Thanks!
- jeremy

