[Haskell-cafe] help optimizing memory usage for a program

Kenneth Hoste kenneth.hoste at ugent.be
Mon Mar 2 15:14:27 EST 2009

On Mar 2, 2009, at 19:13 , Manlio Perillo wrote:

> Manlio Perillo ha scritto:
>> [...]
>> > moreover, you may set up"growing factor". with a g.f. of
>>> 1.5, for example, memory will be collected once heap will become  
>>> 1.5x
>>> larger than real memory usage after last GC. this effectively
>>> guarantees that memory overhead will never be over this factor
>> Thanks.
>> This seems to be effective (but it also reduce performances).
>> 3) With -F1 option
>> [...] I have to parse the whole data set, to check if memory usage  
>> is good.
> Ok, done:
>  real	49m7.369s
>  user	45m21.642s
>  sys	0m18.893s
>  814 MB used
> This is better memory usage, respect to:
>  real	7m17.853s
>  user	3m38.506s
>  sys	0m7.612s
>  1586 MB used
> However, Kenneth Hoste reported (http://boegel.kejo.be/):
>  26 minutes, with 700 MB used.
> Maybe he was using the latest GHC version.
> I would also like to check how performances are with other  
> functional languages.

The 26m/700MB I mentioned on my blog was on my ancient PowerBook G4  
(1.5GHz PowerPC G4, 1.25G).

I redid the same experiment on our iMac (Core2 Duo, 2.0 GHz, 3.0G),  
- read in all the data
- count the number of keys in the IntMap (which should be 17,770, i.e.  
the number of movies)
- compute the mean overall movie rating (which should be 3.6033)

That task was done in 3m42s, using just 632M of memory, using the  
following command:

./netflix ./training_set/ +RTS -A128M -s

The -A option makes sure GC isn't cleaning up stuff too frequently,  
while -s just reports some statistics.

The way in which I'm reading in the data is somewhat different from  
I construct the IntMap from the ground up, i.e. starting with an empty  
IntMap and using foldM,
together with a function readMovie with the following type:

readMovie :: IntMap (UArray Int Int, UArray Int Word8) -> FilePath ->  
IO (IntMap (UArray Int Int, UArray Int Word8))

In readMovie, I'm using the 'insert' function provided by the IntMap  
module, which justs insert a new key-value pair
in the existing IntMap.

Your approach is very different: you create 17,770 IntMaps with a  
single key/value pair in them,
and then use union to combine them all.

I profiled your approach on the same iMac (using the same +RTS options),
and it needed 4m33s to run, using 948M of memory.

I think my approach is turning out better because I'm:

- building up the IntMap using 'empty' and 'insert', instead of  
combining 17,770 'singleton' IntMaps
   (which probably results better GC behavior)
- using UArray instead of Urr (although I don't know if that actually  
makes a difference here)

I hope this helps you with figuring out what the bottleneck is on your  
It took me several days to come up with this approach, with the help  
from various Haskellers at IRC,
so I'm surely no expert...

I've thrown my current code online at http://boegel.kejo.be/files/Netflix_read-and-parse_24-02-2009.hs 
let me know if it's helpful in any way...

Also, I was indeed using GHC 6.10.1, although I'm unsure to what  
extent that matter.




Kenneth Hoste
Paris research group - ELIS - Ghent University, Belgium
email: kenneth.hoste at elis.ugent.be
website: http://www.elis.ugent.be/~kehoste
blog: http://boegel.kejo.be

More information about the Haskell-Cafe mailing list