[Haskell-cafe] following up on space leak

Sun Jul 5 23:27:29 EDT 2009

On 7/5/09, Alexander Dunlap <alexander.dunlap at gmail.com> wrote:
> On Sun, Jul 5, 2009 at 7:46 PM, Uwe Hollerbach<uhollerbach at gmail.com>
> wrote:
>> On 7/5/09, Paul L <ninegua at gmail.com> wrote:
>>> Previously you had lastOrNil taking m [a] as input, presumably
>>> generated by mapM. So mapM is actually building an entire list before
>>> it returns the argument for you to call lastOrNil. This is where you
>>> had unexpected memory behavior.
>>>
>>> Now you are "fusing" lastOrNil and mapM together, and instead of
>>> building a list, you traverse it and perform monadic action along the
>>> way. This can happen in a constant memory if the original pure list is
>>> generated lazily.
>>>
>>> I think the real problem you had was a mis-understanding of mapM, and
>>> there was nothing wrong with your previous lastOrNil function. mapM
>>> will only return a list after all monadic actions are performed, and
>>> in doing so, it inevitably has to build the entire list along the way.
>>>
>>> --
>>> Regards,
>>> Paul Liu
>>>
>>> Yale Haskell Group
>>> http://www.haskell.org/yale
>>
>> Hi, Paul, thanks for the comments. You're quite right that I am fusing
>> the two functions together, but I think I wasn't mis-understanding
>> mapM... I knew I was generating the entire list, and aside from the
>> slight inefficiency of generating it only to tear it down an instant
>> later, that would have been no problem. But I was expecting all of the
>> memory associated with the list to be reclaimed after I had processed
>> it, and that was what was not happening as far as I could tell. (This
>> isn't one monolithic list, by the way; it's the small bodies of a
>> couple of small scheme functions that get evaluated over and over. So
>> the setup and teardown happens a lot.) I don't have very good
>> intuition yet about what should get garbage-collected and what should
>> get kept in such situations, and in fact I'm kind of in the same boat
>> again: the test case now runs much better, but it still leaks memory,
>> and I am again stumped as to why. Could I see something useful by
>> examining ghc core? I haven't looked at that yet, no idea what to look
>> for...
>>
>> Uwe
>> _______________________________________________
>
> mapM_ might be useful to you. I know there are cases where mapM leaks
> memory but mapM_ doesn't, basically because mapM_ throws away all of
> the intermediate results immediately. You might want to condition on
> nullness of the list and then mapM_ your function over the init of the
> list and then just return the function on the last element of the
> list.
>
> Alex

Oh, sorry, I was not clear in my original note in this thread: the
lastOrNil issue seems to be solved. That part of the code is, as far
as I can tell, not leaking memory at all anymore. I think I can claim
that because now the constant memory allocation is showing up visibly
in the profiling output; before, it was lost in the noise. So, if
there is a leak there, it's tiny compared with the constant stuff at
least for this benchmark. There are still two or perhaps three leaks,
and these show up as large but not huge compared to the constant
stuff. I've got a plot of this up on the haskeem website:
http://www.korgwal.com/haskeem/run_new.png.

The bits where I am stumped now are two-fold: one is (I think)
analogous to the lastOrNil issue, except that instead of feeding the
result to lastOrNil, I am doing a more general fold. So there I do
need all the results. I tried the same fusion as with lastOrNil/mapML,
and as far as I can tell I'm not building any lists; but this time it
didn't change the behavior at all, other than causing the names of
some profiling cost centers to change. This is the #2 entry on the
plot above.

The other issue seems to have something to do with IORefs, I'm
dynamically building environments for my scheme functions, and somehow
there seems to be something going wrong with reclaiming that memory
after it's done. This is the #1 and I think #3 entry on the plot. I
don't know enough details there yet to be able to say any more.

Uwe