[Haskell-cafe] How on Earth Do You Reason about Space?

Ketil Malde ketil at malde.org
Wed Jun 1 16:07:55 CEST 2011


Aleksandar Dimitrov <aleks.dimitrov at googlemail.com> writes:

> Now, here's some observations: on a 75M input file (minuscule, compared to what
> I actually need) this program will eat 30M of heap space (says profiling) and
> return in 14 secs.
>
> I have two problems with that: a) that's too much heap space, b) the actual memory
> residency is *much* worse.

30M isn't a lot these days.  How does it scale?

> ad b) average memory residency is at 38MB (this is OK, given heap consumption)
> but max residency is at 130MB, which is unacceptable to me (remember that I need
> to run this on files *much* bigger than just 75M.)

I think max residency (depending on how you measure) can be twice the
heap size due to using copying GC.  If you run short of memory, the
runtime will switch to compacting GC which will be slower but use less
memory. 

> I have tried and tried again to avoid writing programs in Haskell that would
> leak space like BP likes to leak oil.

I know the feeling.  I think making a frequency table ought to be as
simple as 

   ... M.fromListWith (+) . map (,1) . words =<< readFile ...

or at worst

  ... foldl' (\m w -> M.insertWith' (+) w 1 m) M.empty . words =<< readFile...

(which evaluates things strictly, and at least in my small tests, seem
to use quite a bit less heap space).

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants



More information about the Haskell-Cafe mailing list