[Haskell-cafe] Haskell maximum stack depth

Wed Feb 6 03:20:18 EST 2008

Neil Mitchell wrote:
> Hi
> 
>> If you mean an example of coding style and choice of stack vs. heap,
>> I already have..
>>
>>   http://www.haskell.org/pipermail/haskell-cafe/2008-January/038832.html
> 
> I'm at a loss as why you want a strict version of take. It's clearly
> not for performance, as it performs slower. I'd say both the gobbler
> programs have a bug, namely that they are not sufficiently lazy.

I have already partially scanned the list looking for the first
element that satisfies some condition, using a tail recursive search.

If such an element is found I want to split the list at that point.

If such an element is not found the entire list has been scanned without
using any "extra" stack or heap (other than that used by the list itself
and the condition test).

I could build the reversed list accumulator on the heap as I did the
search, but I don't because this will be completely wasted effort in the
case where such an element is not found. So instead I just use an
unboxed Int to count how far I get and have the search return this
and the unsearched suffix (in the case where a matching element is
found).

But the lifetimes of the list prefix and suffix from this point on are
completely unrelated so I don't want the prefix thunk to be hanging on
to the unknown sized suffix. As I already know that the list has been
evaluated at least up to the point that it gets "chopped off", I choose
to use a strict (eager) take.

> As an aside, my version of this function would be:
> 
> neilGobbler :: Int -> [x] -> [x]
> neilGobbler n xs = length res `seq` res
>     where res = take n xs
> 
> I have no idea if it takes the heap or the stack, or if it performs
> faster or slower. If you still have whatever test you used on the
> gobbler, perhaps you could tell us.

My guess is it will use O(1) stack and burn O(n) heap (in addition that
actually used by the result), so assymptotic complexity wise same as
heapGobbler, but probably higher constant factors with ghc due to lazy
building of take thunks and subsequent reduction and indirection costs.

>> If you mean an example of it biting with lazy code, this is discussed
>> so often you'd be spoiled for choice if you search the mailing list
>> archives. Here's a good one..
>>
>>   http://www.haskell.org/pipermail/haskell-cafe/2005-December/013521.html
>>
>> This example actually shows the problem twice. In one case it's solvable
>> by using foldl' instead of foldl.
> 
> Which reduces the memory from O(n) to O(1).

Are you sure about that? Using foldl' here eliminates one of the two
possible sources of stack overflow, but it doesn't eliminate a space
leak. It's O(n) either way. Using strict Map insertion will eliminate
a space leak (in this case) and also a possible source stack overflow.

> Surely thats a good thing,

Would be if it was true :-)

> and the code before had a space leak. Space leak is bad, therefore
> telling people about it is good.

There are plenty of space leaks that won't cause stack overflows, and
plenty of stack overflows that aren't caused by space leaks (see above
for one example).

Again I have to say that even if true, I think this is a pretty lame
justification for the current implementation. The *default* behaviour of
any useful program should surely be to make best effort to carry on
working (and in due course deliver an answer or whatever), even if
there is unexpectedly high stack use for some reason (that may or may
not be a "bug").

> I think its sensible to let people set their own stack bound (which is
> possible),

I have no objection to people bounding their stack if that's their
choice. I can't imagine why anybody who stopped to think about this
would actually want this feature, but it's free world.

What I object to is it being bounded by default to something other
than overall program memory limit. I know that I could probably
achieve this effect myself with +RTS options, but I also want to be
able to write libraries that other people are going to use safely
without having to add a appropriate warning in the documentation
to the effect that some parts use O(n) stack space deliberately, by
design.

But of course this all assumes that underlying implementation is
sufficiently robust to make "unbounded" stacks safe (at least as safe as
any other "unbounded" data structure). Unfortunately it seems this isn't
the case at present if what various folk have told me is true.

> but that clearly just from taking an informal poll of
> respondants to this thread, the current default should indeed be the
> default. You seem to be the only person clamouring for an unlimited
> stack,

Yes, this is strange. Same thing happened in the "global variables"
debate despite it being obvious to any thinking person that I was
correct. Denial of the reality of some very simple examples of the
problem was typical of that debate too.

:-)

Regards
--
Adrian Hey