increasing control stack size?
Alastair Reid
reid@cs.utah.edu
Fri, 10 Nov 2000 12:28:08 -0700
> As I was getting "Control stack overflow" while
> attempting to generate large (>1MB) data files using Hugs,
> I recompiled Hugs with 10-fold increased value of the corresponding
> parameter (was 16K). Then I was able to do the necessary computations.
> However, I wonder if it was safe to so, i.e. that this change
> didn't break functionality of the Hugs system.
It should be fine.
The only risk is that, having eliminated the control-stack limit, you might run
into the C-stack limit. This tends to show itself as some kind of segmentation
fault.
> Hugs documentation sounds as if one gets "Control stack overflow" then
> it surely must be an infinite recursion; while in my case it's just
> concatenation of long (>20K) strings.
Well, that is the usual case. When it's not infinite recursion, it's usually
one of:
1) Strange operational behaviour due to laziness.
For example, this expression:
foldl (+) 0 [1..1000000]
would require constant space if strictly evaluated.
(It corresponds to a simple for loop in C.)
When evaluated lazily though, it first builds this thunk:
((((0+1)+2)+3)+...+1000000
(which takes a lot of heap space) and then evaluates it which
requires about 1000000 stack frames.
2) The quadratic concat problem.
Evaluating this:
[1] ++ ([2] ++ ([3] ++ [4]))
takes linear time in the length of the list. (As you'd expect.)
And requires constant stack space.
Evaluating this:
((([1] ++ [2]) ++ [3]) ++ [4]
takes time quadratic in the length of the list
and requires stack space which is linear in the length of the list
(I think - I haven't checked the stack space claim and laziness makes
it hard to just guess.)
The reason for the difference is that ++ takes time proportional to the
length of its first argument. The first version appends a singleton
list onto a bigger list N times. The second version appends an
increasingly long list onto a singleton list N times.
Of course, Haskell defines that ++ associates to the right so the problem
doesn't show up in single expressions. But it appears very rapidly if
you write a recursive function which constructs a list by appending
the result of recursive calls to itself. For example:
data Tree a = Leaf a | Branch (Tree a) (Tree a)
showTree :: Show a => Tree a -> String
showTree (Leaf x) = show x
showTree (Branch l r) = "(" ++ showTree l ++ "," ++ showTree r ++ ")"
run that on big enough a tree and you'll be doing an awful lot of work.
The usual fix is to avoid calling ++ with unbounded size arguments
using an extra argument which you want the result to be appended to.
That is, we define a new function "showTree'" which satisfies this
equation:
showTree' t s = showTree t ++ s
but whose definition is:
showTree' :: Show a => Tree a -> String -> String
showTree' (Leaf x) s = show x ++ r
showTree' (Branch l r) s = "(" ++ showTree' l ("," ++ showTree' r (")" ++
s)
This still calls ++ but the result of showTree' is never used as the first
argument of ++. The result runs in time proportional to the size of the
output and (probably) stack space proportional to the height of the tree.
Simon Peyton Jones' book "Implementing Functional Languages" has a
longer discussion of this in chapter 1. (Many other books discuss it too
and might be more useful to a Haskell user - but that's the one I know.)
Hope this helps,
Alastair Reid
> I guess it would've been nice if
> documentation said something more about that, and explained ways of
> increasing stack size.
>
> IMHO, some internal Hugs part uses the stack (thus, something that is
> not controllable by the command line options) for data that ought to
> be held in the heap (whose size is controllable by command line
> options)...
>
> Best regards,
> Dmitrii Pasechnik
> http://ssor.twi.tudelft.nl/~dima/
>
>
>
>
>
> _______________________________________________
> Hugs-Bugs mailing list
> Hugs-Bugs@haskell.org
> http://www.haskell.org/mailman/listinfo/hugs-bugs
>