[ghc-steering-committee] Please review #313: Delimited continuation primops, Shepherd: Simon Marlow

Eric Seidel eric at seidel.io
Thu Dec 3 14:50:24 UTC 2020


On Thu, Dec 3, 2020, at 05:29, Simon Marlow wrote:
> My understanding of what happened in C compilers is that the optimisers 
> want to do some dataflow analysis to determine the ranges of values 
> that variables can take, in order to eliminate unnecessary conditionals 
> and so forth. These analyses are severely crippled if you can't assume 
> the absence of undefined behaviour in C/C++, because most arithmetic 
> operations have undefined behaviour: overflow is undefined, for example.

Right, I think what leads to the surprising behavior in C/C++ is that compilers
sometimes use the results of the dataflow analysis in a manner that violates
causality.

For example, removing a NULL check because later on the pointer
is dereferenced unconditionally. I don't think anyone would complain about
the compiler removing NULL checks *after* the dereference, but doing so
before is a bit unsettling. Suppose the NULL check the compiler cleverly
elided was meant to call into a custom abort() function. Now we have to hope
the compiler can figure out that our abort() wrapper doesn't return, and we
probably have to annotate our abort() to help the compiler along.

Another example that always comes to mind is where the compiler may
replace an indirect call via a NULL function pointer with a direct call to a
function f() that is never actually assigned to the function pointer, because
it detected that f() is the only possible target of the pointer and calling a
NULL function pointer is also UB.
 
> We don't do this kind of analysis in GHC at the moment, and we don't 
> have quite so much undefined behaviour floating around. But I can 
> imagine similar issues might arise. I mean, we most definitely assume 
> that the result of `unsafeCoerce#` is a value of the claimed type - 
> what else can you do? I suppose I'm not really clear on what 
> specifically it would mean to adopt "illegal but unchecked" as the 
> policy for undefined behaviour. Does it mean the compiler can't assume 
> anything about the result of an operation that has any undefined 
> behaviour?

That's a fair point. I'm not entirely clear on where to draw the line either,
but maybe respecting causality would be a good place to start. So it's fine
for us to assume that operations that have UB complete successfully, and
to assume the inputs satisfy the operation's contract from that point on,
but maybe we should avoid propagating those assumptions back before
the call-site that may invoke UB. But then what do we do about loops? It
seems we would either have to special case the first iteration or block
optimizations in the first section of the loop on every iteration.. Both
options sound terrible..


More information about the ghc-steering-committee mailing list