panic when compiling SHA

Ben Lippmeier benl at ouroborus.net
Wed Jan 8 04:30:05 UTC 2014


On 08/01/2014, at 10:57, Adam Wick <awick at galois.com> wrote:

> I also just performed an experiment in which I took the SHA library, deleted all instances of INLINE and SPECIALIZE, and compiled it with HEAD on 32-bit Linux. You get the same crash. So my usage of SPECIALIZE and INLINE is beside the point. 

Ok, then maybe the default inliner heuristics are a bit too eager for this program. Whether that counts as a bug is open for debate. The standard way of setting such heuristics is to compile a "representative" set of benchmarks (e.g., nofib) and choose settings that give good average performance. I don't think this is an ideal approach, but it's the typical one in compiler engineering.
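
FWIW those heuristics are tunable from the command line, so there is room to experiment. Here's a sketch of one way to damp the inliner for a single module -- the flags are real GHC flags, but the numbers are guesses rather than tested settings, and the module name is just a placeholder:

{-# OPTIONS_GHC -funfolding-creation-threshold=100
                -funfolding-use-threshold=30
                -fsimpl-tick-factor=200 #-}
-- The unfolding thresholds control how large a function may be before
-- GHC stops considering it for inlining. -fsimpl-tick-factor gives the
-- simplifier extra headroom before it bails out, which is relevant if
-- the panic you're seeing is the "simplifier ticks exhausted" one.
module Damped where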


>> Sadly, "valid input" isn't a well defined concept in practice. You could write a "valid" 10GB Haskell source file that obeyed the Haskell standard grammar, but I wouldn't expect that to compile either.
> 
> I would. I’m a little disappointed that ghc-devs does not. I wouldn’t expect it to compile quickly, but I would expect it to run.

To satisfy such a demand, GHC would need linear space usage with respect to the size of the input program. That implies it must also be linear in the number of top-level declarations, the number of variables, the number of quantifiers in type signatures, and every other countable thing in the input program. It would also have to stay within every other finite resource that might run out, like symbol table entries. If you had a billion top-level foreign-exported declarations in the source program, I suspect the ELF linker would freak out.
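
To put a rough number on that last point, here's a back-of-the-envelope sketch. The 16-byte Elf32_Sym size is real; the 20 bytes per symbol name is a made-up average:

-- Each symbol in a 32-bit ELF object costs a 16-byte Elf32_Sym table
-- entry plus its NUL-terminated name in the string table (assume ~20
-- bytes per name).
symbolTableBytes :: Integer
symbolTableBytes = 10^9 * (16 + 20)
-- ~36 GB of symbol table, more than a 32-bit linker can even address.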

I'm not trying to be difficult or argumentative -- I think limits like these come naturally with a concrete implementation.

I agree it's sad that client programmers can't just enable -O2 and expect every program to work. It'd be nice to have optimisation levels that come with resource or complexity guarantees, like "enabling this won't make the code size non-linear in the size of the input", but that's not how it works at the moment. I'm not aware of any compiler for a "high level" language that makes such guarantees, though there may be some -- I'd be interested to hear of any.


>> Adding an INLINE pragma is akin to using compile-time meta programming.
> 
> Is it? I find that a strange point of view. Isn’t INLINE just a strong hint to the compiler that this function should be inlined? How is using INLINE any different from simply manually inserting the code at every call site?

It's not a "hint" -- it *forces* inlining at every call site, just as you said. GHC will make a fresh copy of the function body at every call site, and it will not back out if the program gets "too big".
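
Concretely, the pragma is attached to a binding like so, and the effect is much as if you had written the body out by hand at each use (a trivial made-up example):

{-# INLINE double #-}
double :: Int -> Int
double x = x + x

four :: Int
four = double 2
-- Compiled roughly as if you had written: four = 2 + 2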

Suppose:

f x = g x ... g x' ... g x''
g y = h y ... h y' ... h y''
h z = i z ... i z' ... i z''
...

Now force inlining for all of f, g, h and so on. I'd expect to see at least 3*3*3 = 27 copies of the body of 'i' in the core program, and even more if SpecConstr or the LiberateCase transform is turned on. We had (and still have) big problems like this with DPH. It took too long for the DPH team to unlearn the dogma that "inlining and call-pattern specialisation make the program better".
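
If you want to watch it happen, here's a concrete (made-up) instance of that pattern, small enough to inspect with -ddump-simpl:

module Blowup where

{-# INLINE f #-}
f :: Int -> Int
f x = g x + g (x + 1) + g (x + 2)

{-# INLINE g #-}
g :: Int -> Int
g y = h y + h (y + 1) + h (y + 2)

{-# INLINE h #-}
h :: Int -> Int
h z = i z + i (z + 1) + i (z + 2)

{-# INLINE i #-}
i :: Int -> Int
i w = w * w + 1

-- After the simplifier runs, every saturated call of 'f' contains
-- 3 copies of the body of 'g', 9 of 'h', and 27 of 'i'.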

Ben.



