[Haskell-cafe] Re: XML (HXML) parsing :: GHC 6.8.3 space leak from 2000

Lev Walkin vlm at lionet.info
Thu Sep 18 12:46:09 EDT 2008


Simon Marlow wrote:
> Lev Walkin wrote:
> 
>> I wondered why would a contemporary GHC 6.8.3 exhibit such a leak?
>> After all, the technique was known in 2000 (and afir by Wadler in '87)
>> and one would assume Joe English's reference to "most other Haskell
>> systems" ought to mean GHC.
> 
> Thanks for this nice example - Don Stewart pointed me to it, and  Simon 
> PJ and I just spent some time this morning diagnosing it.
> 
> Incedentally, with GHC 6.8 you can just run the program with "+RTS -hT" 
> to get a basic space profile, there's no need to compile it for 
> profiling - this is tremendously useful for quick profiling jobs.  And 
> in this case we see the the heap is filling up with (:) and Tree 
> constructors, no thunks.
> 
> Here's the short story:  GHC does have the space leak optimisation you 
> refer to, and it is working correctly, but it doesn't cover all the 
> cases you might want it to cover.  In particular, optimisations 
> sometimes interact badly with the space leak avoidance, and that's what 
> is happening here.  We've known about the problem for some time, but 
> this is the first time I've seen a nice small example that demonstrates it.
> 
>>     -- Lazily build a tree out of a sequence of tree-building events
>>     build :: [TreeEvent] -> ([UnconsumedEvent], [Tree String])
>>     build (Start str : es) =
>>             let (es', subnodes) = build es
>>                 (spill, siblings) = build es'
>>             in (spill, (Tree str subnodes : siblings))
>>     build (Leaf str : es) =
>>             let (spill, siblings) = build es
>>             in (spill, Tree str [] : siblings)
>>     build (Stop : es) = (es, [])
>>     build [] = ([], [])

[skip]

> We don't know of a good way to fix this problem.  I'm going to record 
> this example in a ticket for future reference, though.

Simon,

is there a way, perhaps, to rewrite this expression to avoid leaks?
An ad-hoc will do, perhaps split in two modules to avoid intramodular
optimizations?

-- 
Lev Walkin
vlm at lionet.info


More information about the Haskell-Cafe mailing list