[Haskell-cafe] is 256M RAM insufficient for a 20 million element Int/Int map?

Bertram Felgenhauer bertram.felgenhauer at googlemail.com
Sun Oct 19 08:26:29 EDT 2008


Bulat Ziganshin wrote:
> Hello Bertram,
> 
> Sunday, October 19, 2008, 6:19:31 AM, you wrote:
> 
> > That's 5 words per elements
> 
> ... that, like everything else, should be multiplied by 2-3 to
> account GC effect

True. You can control this factor though. Two RTS options help:

  -c  (Enable compaction for all major collections) - mostly
      avoids fragmentation in the old generation.
  -F<factor>
      (Control the amount of memory reserved in terms of the size
      of the oldest generation. The default is 2, meaning that if
      the oldest generation is 200MB in size, 400 MB of heap will
      be used)

Consider this program,

> module Main (main) where
> 
> import qualified Data.IntMap as M
> import Data.List (foldl')
> 
> main = do
>     loop (M.fromList [(i,0) | i <- [1..5000000]]) 1
> 
> loop dict j = do
>     i <- readLn
>     print $ dict M.! (i :: Int)
>     let dict' = foldl' (\m (k, v) -> M.insert k v m) dict [(,) i $! j*i | i <- [j`mod`10 * 500000 + 1..j`mod`10 * 500000 + 500000]]
>     loop dict' (j+1)

This program maintains an IntMap with 5 million entries, which means
200 MB of live data on a 32 bit computer. It updates the map a lot, too,
so I think this is a fairly realistic example.

Running it on a 51 line input with various RTS options [*], we get:

Options          Memory used   Time used
+RTS -c -F1.1    220 MB        3m22s
+RTS -c -F1.2    243 MB        2m12s
+RTS -c -F1.5    306 MB        1m58s
+RTS -c          398 MB        1m57s
+RTS -F 1.1      406 MB        1m43s
+RTS -F 1.2      425 MB        1m15s
+RTS -F 1.5      483 MB        1m6s
none             580 MB        1m11s

Heap residency was around 200.5 million bytes in all runs.

As expected, saving memory this way doesn't come cheap - it can
dramatically increase the program's runtime. But if a program builds
and slowly updates a large dictionary, playing with these options can
help a lot.

Bertram

[*] time (seq 50; echo 0) | ./Main +RTS -sstderr -c -F1.2


More information about the Haskell-Cafe mailing list