Should exhaustiveness testing be on by default?

Sun Jun 7 07:23:57 EDT 2009

>>>> One thing that wasn't available when this discussion was last active
>>>> is 'mapException' (btw, similar to 'catch'/'catches', a 'mapExceptions'
>>>> would be useful). 

>> so for mere traces, dynamic seems to be the choice (with an option of 
>> pseudo-cbv or the real dynamic stack).
> 
> I don't know what pseudo-cbv is. 

oops, forward reference - explained later in the same email: pretend
that evaluation is call-by-value and produce stack for that (instead of
the actuall call-by-need stack which can be confusing).

> And I claim the dynamic stack is almost never what you want.

Maybe so - I'd be interested in an example on which this claim is based.

Perhaps I've been misunderstanding what you mean by "lexical stack"?
"lexical" to me implies only scope information, nothing related to run
time call chains, which would be "dynamic". In the "dynamic" case, one
can then distinguish between call-by-need stack (what actually happens
in GHC) and call-by-value stack (pretend that everything is strict).

What the cost-centre stack delivers appears to be more than scopes,
and less than a full static call graph (which would have to include non
deterministic branches, since the actual choice of branches depends on
runtime information) - it seems to use runtime information to give a slice 
of the full call graph (eg, not all call sites that could call the current function, 
but only the one that did so in the current run)?

>> Here are the +RTS -xc and mapException outputs together (..
>> - they seem to complement each other (-xc has no locations, but 
>> names for the lexical stack; mapError has no names, but locations 
>> for the dynamic stack; we're still missing the parameters for either stack):
> 
> I'm not claiming that +RTS -xc in its present form is what you want. 
> I'm interested in finding an underlying mechanism that allows the right 
> information to be obtained; things like source locations and free 
> variables are just decoration.

And I'm saying that adding mapException annotations is a way to get
there, with very little extra effort (just the get-the-source-location part
of the finding-the-needle transformation would be sufficient).

> Now, most of the existing methods have problems with CAFs.  I 
> doubt that the problems with CAFs are fixable using the source-to-source 
> transformation methods, but I think they might be fixable using 
> cost-centre stacks.

One of the nice things about my suggestion to mix an "annotate
with mapException" transformation with cost-centre stacks is that
it would cover CAFs as well. As another example, I tried the
nofib-buggy:anna test case discussed at 

http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack/StackTraceExperience

which does have just such a CAF problem (the error is caused in a 
local CAF, and raised in a standard library CAF), amongst other nasty
features (no CPP/sed only transformation will handle infix applications,
the initial error message doesn't even tell us where to start annotating).

Starting from the cost-centre approach (which fails on this example), I 
proceeded as follows:

- the initial error is "<GHC.Err.CAF>Main: divide by zero"
- this doesn't tell us where the issue is, so we have to annotate 
    all calls to 'div', which we do by wrapping in 'mapException',
    adding some location information to any exceptions raised (*)
- the changes are

$ darcs wh
hunk ./real/anna/SmallerLattice.hs 229
-             = let diff = ((n2 * 10) `div` n1) - scaleup_ratio
-                   scaleFact = n2 `div` 10
+             = let diff = mapError "local_cost:diff" $ ((n2 * 10) `div` n1) - scaleup_ratio
+                   scaleFact = mapError "local_cost:scaleFact" $ n2 `div` 10
hunk ./real/anna/Utils.hs 38
+import Control.Exception
hunk ./real/anna/Utils.hs 105
-            = let k    = s1 `div` 53668
+            = let k    = mapError "rands:k" $ s1 `div` 53668
hunk ./real/anna/Utils.hs 109
-                  k'   = s2 `div` (s1' - s1')
+                  k'   = mapError "rands:k'" $ s2 `div` (s1' - s1')
hunk ./real/anna/Utils.hs 635
+
+mapError src = mapException (\(SomeException e)->ErrorCall $ show e++"\n"++src)
+errorSrc src = error . (++"\n"++src)

- I added these annotations by hand - any automated transformation could
    easily be more precise about the source location messages

- output of annotated version (reformatted, as for the others on that page):

    <GHC.Err.CAF>
    <Utils.mapError,
     Utils.utRandomInts,
     FrontierDATAFN2.fdFs2,
     FrontierDATAFN2.fdFind,
     FrontierGENERIC2.fsMakeFrontierRep,
     StrictAn6.saFixAtSizeLive,
     StrictAn6.saFixMain,
     StrictAn6.saFixStartup,
     StrictAn6.saGroups,
     StrictAn6.saMain,
     Main.maStrictAn,
     Main.main,
     GHC.Err.CAF>
    Main.exe: divide by zero
    rands:k'

This appears to be in the same ballpark as what the more complex
transformations (hat, finding-the-needle) deliver.

Claus

(*) I took a shortcut here, by mapping all exceptions to 'ErrorCalls';
    if the original code contained exception handlers, one would have
    to be more careful, generically extending all exceptions without
    changing their types; but this has to be done only once, and the
    extended 'mapError' would go into the standard debug libraries.