[GHC] #8189: Default to infinite stack size?
GHC
ghc-devs at haskell.org
Wed Aug 28 00:52:38 UTC 2013
#8189: Default to infinite stack size?
------------------------------------+-------------------------------------
Reporter: nh2 | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.6.3
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: Runtime crash
Difficulty: Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: |
------------------------------------+-------------------------------------
http://www.haskell.org/pipermail/haskell-cafe/2013-August/108570.html
It looks like any code of the form
{{{
... = do
x1 <- m1
x2 <- m2
[do something with x1 and x2]
}}}
will stack overflow when used with a strict monad (like IO) on
sufficiently large input (a one-million-element list is enough to
exceed the default 8 MB stack size limit).
Other examples include:
{{{
sequence (m:ms) = do x <- m
                     xs <- sequence ms
                     return (x:xs)
}}}
and
{{{
liftM2, liftM3, ...
}}}
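These have the same two-bind shape as the snippet at the top; for
reference, liftM2 is essentially

{{{
liftM2 :: Monad m => (a -> b -> c) -> m a -> m b -> m c
liftM2 f m1 m2 = do { x1 <- m1; x2 <- m2; return (f x1 x2) }
}}}

and recursive uses of such combinators (e.g. sequence written as
foldr (liftM2 (:)) (return [])) build the same unbounded chain of
pending binds.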
By slightly modifying the program to do the same thing via heap
allocation (e.g. using a difference list in sequence), a program that
was rejected with a stack overflow becomes a valid program.
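As a sketch of that transformation (a hypothetical sequenceHeap, not
the actual library code), the partial result can be accumulated on the
heap and fixed up once at the end:

{{{
-- Sketch only: a heap-allocating variant of sequence for strict monads.
-- The pending conses live in the accumulator on the heap instead of
-- occupying one stack frame per element; a difference list works
-- analogously.
sequenceHeap :: Monad m => [m a] -> m [a]
sequenceHeap = go []
  where
    go acc []     = return (reverse acc)
    go acc (m:ms) = do
      x <- m
      go (x : acc) ms
}}}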
This means that valid and idiomatic code like
{{{
xs <- replicateM n someIOAction
xs <- mapM someIOAction [1..n]
}}}
will eventually crash once the input is large enough, which affects
practically all Haskell software and libraries out there.
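A minimal, self-contained illustration of such a crash (a hypothetical
program; the exact numbers depend on the machine and RTS settings)
might look like:

{{{
import Control.Monad (replicateM)

-- With the default 8 MB stack limit of GHC 7.6, collecting a couple of
-- million results of a strict IO action this way typically aborts with
-- "Stack space overflow" instead of printing the length.
main :: IO ()
main = do
  xs <- replicateM (2 * 1000 * 1000) (return 'x')
  print (length xs)
}}}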
For examples, see
https://github.com/ndmitchell/shake/blob/e0e0a43/Development/Shake/Database.hs#L394
https://github.com/haskell-suite/haskell-src-exts/issues/27
as well as the mapM usages in cabal, haddock, any database binding
that allows querying lists of results, etc.
While some argue that ''"you shouldn't use lists of that size"'',
''"allocating a 150 MB list is wrong"'', or that they ''"carefully
avoid any use of sequence/mapM due to the problem"'' pointed out above,
I think that collecting a million results of a strict monadic action
into a linked list using standard functions like mapM is a common
pattern and a valid use case. The stack-based evaluation that GHC
performs is an optimization and should be transparent to the user; it
should not redefine some data to live in a fixed-size memory region,
and it should work just as well as it does in GHCi.
It seems that a default infinite stack size would mitigate this.
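For reference, the limit can already be raised (or effectively
removed) per program with the standard RTS flags, assuming the binary
is linked with -rtsopts:

{{{
$ ghc -rtsopts Main.hs
$ ./Main +RTS -K512m -RTS            # raise the stack limit for this run

$ ghc -with-rtsopts=-K512m Main.hs   # or bake the larger limit into the binary
}}}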
A drawback might be that the stack overflows we currently rely on to
quickly spot a "wrong" program would no longer indicate certain errors.
I doubt the usefulness of this debugging method, though: it does not
work reliably in the presence of optimization, doing the same thing via
heap allocation already makes such programs "correct", and it rejects
valid programs once the input data is large enough.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8189>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler