[Haskell-cafe] trying to understand space leaks....

Wed May 27 18:12:25 EDT 2009

There's still the space used by the closure "b".

An example:

expensiveParser :: Parser Char ExpensiveStructure

simple :: Parser Char Int

withExpensive :: ExpensiveStructure -> Parser Char Int
withExpensive _ = mzero  -- actually always fails, not using its argument.

example = do
    e <- expensiveParser
    simple `mplus` withExpensive e

The expensive structure constructed by expensiveParser needs to be
kept in memory throughout the entire parsing of "simple", even though
withExpensive doesn't actually use it and would immediately fail.  A
smarter parser could realize that e couldn't actually ever be used and
allow the GC to free it much more quickly.

This example can be made arbitrarily more complicated; withExpensive
could run different things based on the value of "e" that could be
determined to fail quickly, simple might actually do a lot of work,
etc.  But during the "mplus" in the monadic parser, we can't free e.

  -- ryan

On Wed, May 27, 2009 at 12:49 PM, Daryoush Mehrtash <dmehrtash at gmail.com> wrote:
> So long as the [s] is a fixed list (say [1,2,3,4]) there is no space
> leak.    My understanding was that the space leak only happens if there is
> computation involved in building the list of s.      Am I correct?
>
> If so, I still don't have any feeling for what needs to be saved on the heap
> to be able to back track on computation that needs and  IO computation
> data.    What would be approximate  space that an IO (Char) computation
> take  on the heap, is it few bytes, 100, 1k,  ....?
>
> Daryoush
>
>
> On Tue, May 26, 2009 at 6:11 PM, Ryan Ingram <ryani.spam at gmail.com> wrote:
>>
>> On Tue, May 26, 2009 at 5:03 PM, Daryoush Mehrtash <dmehrtash at gmail.com>
>> wrote:
>> > newtype Parser s a = P( [s] -> Maybe (a, [s]))
>> (fixed typo)
>>
>> > instance MonadPlus  Parser where
>> >       P a mplus P b = P (\s -> case a s of
>> >                             Just (x, s') -> Just (x, s')
>> >                             Nothing -> b s)
>>
>> > a)what exactly gets saved on the heap between the mplus calls?
>>
>> Two things:
>>
>> (1) Values in the input stream that "a" parses before failing.
>> Beforehand, it might just be a thunk that generates the list lazily in
>> some fashion.
>>
>> (2) The state of the closure "b"; if parser "a" fails, we need to be
>> able to run "b"; that could use an arbitrary amount of space depending
>> on what data it keeps alive.
>>
>> > b)I am assuming the computation to get the next character for parsing to
>> > be
>> > an "IO Char" type computation,  in that case, what would be the size of
>> > the
>> > heap buffer that is kept around in case the computation result needs to
>> > be
>> > reused?
>>
>> Nope, no IO involved; just look at the types:
>>
>> P :: ([s] -> Maybe (a,[s])) -> Parser s a
>>
>> (Parser s a) is just a function that takes a list of "s", and possibly
>> returns a value of type "a" and another list [s] (of the remaining
>> tokens, one hopes)
>>
>> It's up to the caller of the parsing function to provide the token
>> stream [s] somehow.
>>
>> > c) Assuming Pa in the above code reads n tokens from the input stream
>> > then
>> > fails, how does the run time returns the same token to the P b?
>>
>> It just passes the same stream to both.  No mutability means no danger :)
>>
>>  -- ryan
>
>
>