[Haskell-cafe] is 256M RAM insufficient for a 20 million
element Int/Int map?
Bertram Felgenhauer
bertram.felgenhauer at googlemail.com
Sun Oct 19 08:26:29 EDT 2008
Bulat Ziganshin wrote:
> Hello Bertram,
>
> Sunday, October 19, 2008, 6:19:31 AM, you wrote:
>
> > That's 5 words per elements
>
> ... that, like everything else, should be multiplied by 2-3 to
> account GC effect
True. You can control this factor though. Two RTS options help:
-c (Enable compaction for all major collections) - mostly
avoids fragmentation in the old generation.
-F<factor>
(Control the amount of memory reserved in terms of the size
of the oldest generation. The default is 2, meaning that if
the oldest generation is 200MB in size, 400 MB of heap will
be used)
Consider this program,
> module Main (main) where
>
> import qualified Data.IntMap as M
> import Data.List (foldl')
>
> main = do
> loop (M.fromList [(i,0) | i <- [1..5000000]]) 1
>
> loop dict j = do
> i <- readLn
> print $ dict M.! (i :: Int)
> let dict' = foldl' (\m (k, v) -> M.insert k v m) dict [(,) i $! j*i | i <- [j`mod`10 * 500000 + 1..j`mod`10 * 500000 + 500000]]
> loop dict' (j+1)
This program maintains an IntMap with 5 million entries, which means
200 MB of live data on a 32 bit computer. It updates the map a lot, too,
so I think this is a fairly realistic example.
Running it on a 51 line input with various RTS options [*], we get:
Options Memory used Time used
+RTS -c -F1.1 220 MB 3m22s
+RTS -c -F1.2 243 MB 2m12s
+RTS -c -F1.5 306 MB 1m58s
+RTS -c 398 MB 1m57s
+RTS -F 1.1 406 MB 1m43s
+RTS -F 1.2 425 MB 1m15s
+RTS -F 1.5 483 MB 1m6s
none 580 MB 1m11s
Heap residency was around 200.5 million bytes in all runs.
As expected, saving memory this way doesn't come cheap - it can
dramatically increase the program's runtime. But if a program builds
and slowly updates a large dictionary, playing with these options can
help a lot.
Bertram
[*] time (seq 50; echo 0) | ./Main +RTS -sstderr -c -F1.2
More information about the Haskell-Cafe
mailing list