[Haskell-cafe] Space usage problems

Daniel Fischer daniel.is.fischer at web.de
Tue Jan 10 16:41:46 EST 2006


Am Dienstag, 10. Januar 2006 17:44 schrieb Ian Lynagh:
> Hi all,
>
> I am having space issues with some decompression code; I've attached a
> much simplified version as Test1.hs.
>
> At the bottom (foo/bar) is the equivalent of deflate. This should be a
> standalone module which doesn't know about the rest.
>
> In the middle (readChunks) is the equivalent of gunzip. It repeatedly
> calls foo until there is no more input left.
>
> At the top is a simple main function that calls them.
>
> If I do
>
>     dd if=/dev/zero of=data bs=1000 count=3000" # making data around 3MB
>     ghc --make Test1 -o Test1 -O -Wall
>     ./Test1
>
> then in top I see Test1 increasing memory usage to around 150MB. I think
> this is because the "let (ys, zs) = foo xs" means zs holds on to xs
> (it's hard to be sure as compiling for profiling is too happy to change
> the behaviour).

I had 72 Mb space usage for a 3Mb file.

I believe, it's the 'put zs' that's consuming the memory. I changed it to

readChunks = do xs <- get
                if null xs then return []
                           else do let (ys, zs) = foo xs
                                       rest = evalState readChunks zs
                                   return (ys ++ rest)

and got much smaller memory usage (10Mb) -- not sure, how sensible that would 
be for real work and why it reduces memory. If bar can start returning before 
it's finished, then the same holds for the modified readChunks, but the 
original would have to wait for the completion of bar (via foo) until zs can 
be put, so the complete ys would have to be in memory at once.

Just checked, modified version also runs in 10Mb for a 12mb data file, 
so indeed bar starts returning before finishing and it seems the above is 
right.
./test4 +RTS -sstderr
True
1,496,184,404 bytes allocated in the heap
987,852,924 bytes copied during GC
  3,226,492 bytes maximum residency (162 sample(s))

       5707 collections in generation 0 ( 12.66s)
        162 collections in generation 1 ( 14.82s)

         10 Mb total memory in use

  INIT  time    0.00s  (  0.01s elapsed)
  MUT   time    6.88s  ( 15.06s elapsed)
  GC    time   27.48s  ( 55.93s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time   34.36s  ( 71.00s elapsed)

  %GC time      80.0%  (78.8% elapsed)

  Alloc rate    217,468,663 bytes per MUT second

  Productivity  20.0% of total user, 9.7% of total elapsed


>
> I tried (Test2) changing foo to be a monad transformer over the calling
> monad, so the caller's remaining input was updated as we went along, but
> (as well as memory usage not obviously being fixed) this is giving me a
> stack overflow.
>
>
> Has anyone got any suggestions for making a constant space, constant
> stack version?
>
>
> Thanks
> Ian

Hope that helps,
Daniel



More information about the Haskell-Cafe mailing list