Shake fails test with GHC 7.10 RC3

George Colpitts george.colpitts at gmail.com
Sat Mar 21 16:57:50 UTC 2015


my two cents, FWIW: I'd really like to see a bindist for MacOS before we
release it

On Fri, Mar 20, 2015 at 8:16 PM, Simon Peyton Jones <simonpj at microsoft.com>
wrote:

> | FWIW, I've been noodling over this today since I saw this email, and
> | sitting down, I think this may be a blocker worth holding up for after
> | thinking about it. At least, it is until we can quantify how much it
> | might bite (can we find a workaround? How often does it hit in
> | practice?)
>
> I'm reluctant to delay 7.10 further.  I'm sure there are other bugs, and
> we'll want 7.10.2 before too long.   But we won't find them until we push
> it out.  The danger is that we stay in about-to-release mode for ages, and
> that holds everyone up.  We've had three release candidates out over a
> period of months and Neil seems to be the only one who has tripped over it.
>
> I think it'd be better to push it out.  (We planned to do so today!)  What
> do others think?
>
> Simon
>
> | -----Original Message-----
> | From: mad.one at gmail.com [mailto:mad.one at gmail.com] On Behalf Of Austin
> | Seipp
> | Sent: 20 March 2015 22:48
> | To: Herbert Valerio Riedel
> | Cc: Simon Peyton Jones; Austin Seipp
> | Subject: Re: Shake fails test with GHC 7.10 RC3
> |
> | FWIW, I've been noodling over this today since I saw this email, and
> | sitting down, I think this may be a blocker worth holding up for after
> | thinking about it. At least, it is until we can quantify how much it
> | might bite (can we find a workaround? How often does it hit in
> | practice?)
> |
> | I'm testing Iavor's patches that he wanted in at the last minute (I'm
> | really hesitant for that, since we could break something), so we've
> | got some time, but I'd like to at least narrow this down a bit more.
> |
> | On Fri, Mar 20, 2015 at 12:37 PM, Herbert Valerio Riedel
> | <hvriedel at gmail.com> wrote:
> | >
> | > ...does this in any way affect today's GHC 7.10.1 release? I.e. is this
> | > blocking the release?
> | >
> | >
> | > On 2015-03-20 at 18:19:09 +0100, Simon Peyton Jones wrote:
> | >> Neil
> | >>
> | >> Could you open a ticket for this, and attach the info to it?  Lots of
> | useful info in this thread, but may get lost.
> | >>
> | >> Re empty cases, see Note [Empty case alternatives] in CoreSyn.  "A
> | >> case expression can have empty alternatives if (and only if) the
> | >> scrutinee is bound to raise an exception or diverge."
> | >>
> | >> So yes, case (\_ -> ...) of {} is definitely wrong.  Could someone
> | >> extend Core Lint to check for this?  The function to call is
> | >> exprIsHNF: if we have
> | >>       case e of {}
> | >> then exprIsHNF should never return True.
> | >>
> | >> That is a powerful clue, and adding it to Lint will nail the moment at
> | which it first happens.  Good progress.
> | >>
> | >> (I'd do it myself but I'm struggling with backlog, and only have 3
> | days left before going on holiday for 10 days.)
> | >>
> | >> Simon
> | >>
> | >> |  -----Original Message-----
> | >> |  From: Neil Mitchell [mailto:ndmitchell at gmail.com]
> | >> |  Sent: 20 March 2015 16:57
> | >> |  To: Carter Schonwald
> | >> |  Cc: Simon Peyton Jones; ghc-devs at haskell.org
> | >> |  Subject: Re: Shake fails test with GHC 7.10 RC3
> | >> |
> | >> |  > do you use a deepSeq when catching errors from a pure
> | computation? Or
> | >> |  > is it in a context where you know the value should be treated
> | strictly?
> | >> |
> | >> |  The computation is essentially (error "here" :: IO ()), so the IO
> | monadic
> | >> |  binding should force the exception. In the full source code to
> | Shake it's
> | >> |  actually in ContT/ReaderT/IO, but a modification to insert liftIO
> | (if
> | >> |  done in exactly the right way) gives the same result. That
> | simplified
> | >> |  code is at http://community.haskell.org/~ndm/temp/src.zip
> | >> |  (to run, compile with the logic from
> | >> |  https://github.com/ndmitchell/shake/blob/master/.ghci :benchmark,
> | then
> | >> |  run "main oracle test").
> | >> |
> | >> |  > https://github.com/ndmitchell/shake/issues/216 seems to be the
> | ticket
> | >> |  > in question
> | >> |
> | >> |  That is the ticket. The test that is failing is the "oracle"
> | >> |  (Test.Oracle) test suite, and you can see full logs of both success
> | and
> | >> |  failure at Travis:
> | >> |  https://travis-ci.org/ndmitchell/shake/builds/54748475 . Note that
> | it
> | >> |  succeeds on GHC 7.2, 7.4, 7.6 and 7.8, but fails on GHC 7.10 RC3
> | and GHC
> | >> |  Head.
> | >> |
> | >> |  Alas, the actual error is coming from deep in the middle of Shake,
> | >> |  specifically this line:
> | >> |
> |
> https://github.com/ndmitchell/shake/blob/master/src/Development/Shake/Cor
> | >> |  e.hs#L329
> | >> |
> | >> |  Thinking some more on what I saw, my best guess is that:
> | >> |
> | >> |  case (\_ -> ...) of {}
> | >> |
> | >> |  Is a fatal error in GHC. I am guessing that {} means there are no
> | >> |  alternatives because GHC managed to show that the scrutinee throws
> | an
> | >> |  exception. In addition, evaluating a literal lambda to WHNF will
> | not
> | >> |  throw an exception, so you have a contradiction.
> | >> |
> | >> |  Making increasingly wild speculative guesses, I could imagine this
> | code
> | >> |  compiling to CMM that just fell off the end of a basic block,
> | arriving at
> | >> |  the next statement of supposedly unconnected code. That actually
> | matches
> | >> |  what I'm seeing - I have the exception, using print I can see I am
> | at the
> | >> |  line before, and that I never get to the line after, but pop up
> | somewhere
> | >> |  nearby and continue onwards seemingly ignoring the exception ever
> | >> |  happened.
> | >> |
> | >> |  If someone can confirm the above pattern of Core is always illegal
> | and
> | >> |  always indicates a GHC error then I can reduce the test case driven
> | >> |  purely by the Core, which would be far easier - at the moment Shake
> | >> |  passes through these routines several times to set things up,
> | making it
> | >> |  harder to simplify them too much without changing the preparation
> | steps.
> | >> |
> | >> |  Thanks, Neil
> | >> |
> | >> |
> | >> |  >>
> | >> |  >> On Fri, Mar 20, 2015 at 12:01 AM, Neil Mitchell
> | >> |  >> <ndmitchell at gmail.com>
> | >> |  >> wrote:
> | >> |  >> > More delving later, it seems the incorrect optimized version
> | has
> | >> |  >> > been turned into:
> | >> |  >> >
> | >> |  >> > case (\ _ [Occ=Dead, OS=OneShot] -> error "here") of _
> | [Occ=Dead]
> | >> |  >> > {}
> | >> |  >> >
> | >> |  >> > While the working one has been turned into:
> | >> |  >> >
> | >> |  >> > errorFunc argument realWorldToken
> | >> |  >> >
> | >> |  >> > where errorFunc _ = error "here"
> | >> |  >> >
> | >> |  >> > I'm not familiar with case ... of _ {} - what does it mean
> | when
> | >> |  >> > there are no alternatives to the case? And why isn't case on a
> | >> |  >> > literal lambda optimised out? Is it the OneShot annotation
> | (perhaps
> | >> |  >> > coming from the state hack?)
> | >> |  >> >
> | >> |  >> > The full trace is at
> | >> |  >> > https://gist.github.com/ndmitchell/b222e04eb0c3a397c758. I've
> | >> |  >> > uploaded bad (optimises the error out) and good (works as
> | expected)
> | >> |  >> > versions of the Core. The summary files are the subexpression
> | that
> | >> |  >> > changed making a single difference (moving a monomorphic
> | NOINLINE
> | >> |  >> > function from one module to another) plus the handful of
> | functions
> | >> |  >> > they depend on, which I've reformatted/inlined/simplified to
> | produce
> | >> |  the expressions above.
> | >> |  >> > The full versions are the entire -ddump-simpl output from
> | about
> | >> |  >> > halfway through the build, starting when the differences occur
> | -
> | >> |  >> > let me know if you need further back. The dodgy function is
> | "exec".
> | >> |  >> >
> | >> |  >> > Thanks, Neil
> | >> |  >> >
> | >> |  >> > On Thu, Mar 19, 2015 at 11:07 PM, Neil Mitchell
> | >> |  >> > <ndmitchell at gmail.com>
> | >> |  >> > wrote:
> | >> |  >> >> Herbert, thanks for the list of patches, nothing obvious
> | there -
> | >> |  >> >> my best guess is it's something incredibly sensitive and it
> | only
> | >> |  >> >> needs the tiniest change anywhere to make it happen. Things
> | like
> | >> |  >> >> moving NOINLINE monomorphic-type definitions from one module
> | to
> | >> |  >> >> another are causing the bug to appear/disappear, which isn't
> | what
> | >> |  I'd expect.
> | >> |  >> >>
> | >> |  >> >> Simon, changing from error to error in IO causes the bug to
> | >> |  >> >> disappear, but then so do most things. The error return type
> | is
> | >> |  >> >> type IO (), so I suspect that forces it to be raised at the
> | right
> | >> |  >> >> place - but it's certainly one of the possibilities for what
> | is
> | >> |  >> >> going wrong. Diffing the Core is a great idea.
> | >> |  >> >>
> | >> |  >> >> I'll keep reducing and see what I get to. Given the
> | sensitivity of
> | >> |  >> >> the bug, I'm sure a NOINLINE on an out-of-the-way function
> | will
> | >> |  >> >> make it go away, so I can easily fix Shake itself - so I'm
> | more
> | >> |  >> >> tracking it down from the point of GHC now.
> | >> |  >> >>
> | >> |  >> >> Thanks, Neil
> | >> |  >> >>
> | >> |  >> >>
> | >> |  >> >> On Wed, Mar 18, 2015 at 5:04 PM, Simon Peyton Jones
> | >> |  >> >> <simonpj at microsoft.com> wrote:
> | >> |  >> >>> I'm really sorry but I can't think of anything.  Sounds
> | horrible.
> | >> |  >> >>>
> | >> |  >> >>> If you throw exceptions using 'error' (not in IO), then you
> | are
> | >> |  >> >>> of course vulnerable to strictness changes.  If the thing
> | isn't
> | >> |  >> >>> actually evaluated inside the catch block, you won't see the
> | >> |  >> >>> exception.  But I'm sure you've thought of that.
> | >> |  >> >>>
> | >> |  >> >>> I'd experiment with one of the smaller changes you describe,
> | such
> | >> |  >> >>> as adding a putStrLn, and comparing Core, before and after.
> | >> |  >> >>> Switching off -O will make a huge difference, so hard to
> | compare.
> | >> |  >> >>> Turning off the state hack will have a more global effect.
> | But
> | >> |  >> >>> the other changes sound more pin-point and hence the
> | differences
> | >> |  will be smaller.
> | >> |  >> >>>
> | >> |  >> >>> Simon
> | >> |  >> >>>
> | >> |  >> >>> |  -----Original Message-----
> | >> |  >> >>> |  From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On
> | Behalf
> | >> |  >> >>> | Of
> | >> |  >> >>> Neil
> | >> |  >> >>> |  Mitchell
> | >> |  >> >>> |  Sent: 18 March 2015 15:33
> | >> |  >> >>> |  To: ghc-devs at haskell.org
> | >> |  >> >>> |  Subject: Shake fails test with GHC 7.10 RC3
> | >> |  >> >>> |
> | >> |  >> >>> |  Hi,
> | >> |  >> >>> |
> | >> |  >> >>> |  Testing GHC 7.10 RC3 I've found a bug where Shake seems
> | to
> | >> |  >> >>> | catch
> | >> |  >> >>> the
> | >> |  >> >>> |  wrong exception in the wrong place. It's only hit by one
> | of my
> | >> |  >> >>> tests,
> | >> |  >> >>> |  and I've managed to isolate it to a fragment of code with
> | no
> | >> |  >> >>> | unsafePerformIO, that throws exceptions using error (so
> | not in
> | >> |  >> >>> | IO),
> | >> |  >> >>> and
> | >> |  >> >>> |  operates in IO. Turning off the stack hack makes the bug
> | go
> | >> |  >> >>> | away,
> | >> |  >> >>> but
> | >> |  >> >>> |  then so does -O0, marking one of the functions it calls
> | >> |  >> >>> | NOINLINE,
> | >> |  >> >>> or
> | >> |  >> >>> |  moving an INLINE function it calls to a different module,
> | or
> | >> |  >> >>> | adding
> | >> |  >> >>> a
> | >> |  >> >>> |  putStrLn under a catch block - it's very sensitive to the
> | >> |  >> >>> | exact  conditions. This test and this exact code worked
> | fine
> | >> |  >> >>> | with GHC 7.10  RC2.
> | >> |  >> >>> |
> | >> |  >> >>> |  I was wondering if there have been any state hack related
> | >> |  >> >>> | changes
> | >> |  >> >>> or
> | >> |  >> >>> |  other potentially dangerous optimisation changes since
> | RC2?
> | >> |  >> >>> | I'll  continue to try reducing the bug, but it's somewhat
> | >> |  >> >>> | difficult as
> | >> |  >> >>> the
> | >> |  >> >>> |  larger system is quite big, and the code is very
> | sensitive.
> | >> |  >> >>> |
> | >> |  >> >>> |  Thanks, Neil
> | >> |  >> >>> |  _______________________________________________
> | >> |  >> >>> |  ghc-devs mailing list
> | >> |  >> >>> |  ghc-devs at haskell.org
> | >> |  >> >>> |
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> | >> |  >> >
> | >> |  >> > On Thu, Mar 19, 2015 at 11:24 PM, Simon Peyton Jones
> | >> |  >> > <simonpj at microsoft.com> wrote:
> | >> |  >> >> Thanks!  I think a -ddump-simpl before and after the smallest
> | >> |  >> >> change the makes the difference would be illuminating.
> | >> |  >> >>
> | >> |  >> >> Simon
> | >> |  >> >>
> | >> |  >> >> | -----Original Message-----
> | >> |  >> >> | From: Neil Mitchell [mailto:ndmitchell at gmail.com]
> | >> |  >> >> | Sent: 19 March 2015 23:07
> | >> |  >> >> | To: Simon Peyton Jones
> | >> |  >> >> | Cc: ghc-devs at haskell.org
> | >> |  >> >> | Subject: Re: Shake fails test with GHC 7.10 RC3
> | >> |  >> >> |
> | >> |  >> >> | Herbert, thanks for the list of patches, nothing obvious
> | there -
> | >> |  >> >> | my best guess is it's something incredibly sensitive and it
> | only
> | >> |  >> >> | needs the tiniest change anywhere to make it happen. Things
> | like
> | >> |  >> >> | moving NOINLINE monomorphic-type definitions from one
> | module to
> | >> |  >> >> | another are causing the bug to appear/disappear, which
> | isn't what
> | >> |  I'd expect.
> | >> |  >> >> |
> | >> |  >> >> | Simon, changing from error to error in IO causes the bug to
> | >> |  >> >> disappear,
> | >> |  >> >> | but then so do most things. The error return type is type
> | IO (),
> | >> |  >> >> | so I suspect that forces it to be raised at the right place
> | -
> | >> |  >> >> | but it's certainly one of the possibilities for what is
> | going
> | >> |  >> >> | wrong. Diffing the Core is a great idea.
> | >> |  >> >> |
> | >> |  >> >> | I'll keep reducing and see what I get to. Given the
> | sensitivity
> | >> |  >> >> | of
> | >> |  >> >> the
> | >> |  >> >> | bug, I'm sure a NOINLINE on an out-of-the-way function will
> | make
> | >> |  >> >> | it
> | >> |  >> >> go
> | >> |  >> >> | away, so I can easily fix Shake itself - so I'm more
> | tracking it
> | >> |  >> >> | down from the point of GHC now.
> | >> |  >> >> |
> | >> |  >> >> | Thanks, Neil
> | >> |  >> >> |
> | >> |  >> >> |
> | >> |  >> >> | On Wed, Mar 18, 2015 at 5:04 PM, Simon Peyton Jones
> | >> |  >> >> | <simonpj at microsoft.com> wrote:
> | >> |  >> >> | > I'm really sorry but I can't think of anything.  Sounds
> | >> |  horrible.
> | >> |  >> >> | >
> | >> |  >> >> | > If you throw exceptions using 'error' (not in IO), then
> | you
> | >> |  >> >> | > are of
> | >> |  >> >> | course vulnerable to strictness changes.  If the thing
> | isn't
> | >> |  >> >> | actually evaluated inside the catch block, you won't see
> | the
> | >> |  >> >> | exception.  But
> | >> |  >> >> I'm
> | >> |  >> >> | sure you've thought of that.
> | >> |  >> >> | >
> | >> |  >> >> | > I'd experiment with one of the smaller changes you
> | describe,
> | >> |  >> >> | > such
> | >> |  >> >> as
> | >> |  >> >> | adding a putStrLn, and comparing Core, before and after.
> | >> |  >> >> | Switching
> | >> |  >> >> off -
> | >> |  >> >> | O will make a huge difference, so hard to compare.  Turning
> | off
> | >> |  >> >> | the
> | >> |  >> >> state
> | >> |  >> >> | hack will have a more global effect.  But the other changes
> | >> |  >> >> | sound
> | >> |  >> >> more
> | >> |  >> >> | pin-point and hence the differences will be smaller.
> | >> |  >> >> | >
> | >> |  >> >> | > Simon
> | >> |  >> >> | >
> | >> |  >> >> | > |  -----Original Message-----
> | >> |  >> >> | > |  From: ghc-devs [mailto:ghc-devs-bounces at haskell.org]
> | On
> | >> |  >> >> | > | Behalf
> | >> |  >> >> Of
> | >> |  >> >> | Neil
> | >> |  >> >> | > |  Mitchell
> | >> |  >> >> | > |  Sent: 18 March 2015 15:33
> | >> |  >> >> | > |  To: ghc-devs at haskell.org
> | >> |  >> >> | > |  Subject: Shake fails test with GHC 7.10 RC3
> | >> |  >> >> | > |
> | >> |  >> >> | > |  Hi,
> | >> |  >> >> | > |
> | >> |  >> >> | > |  Testing GHC 7.10 RC3 I've found a bug where Shake
> | seems to
> | >> |  >> >> | > | catch
> | >> |  >> >> the
> | >> |  >> >> | > |  wrong exception in the wrong place. It's only hit by
> | one of
> | >> |  >> >> | > | my
> | >> |  >> >> | tests,
> | >> |  >> >> | > |  and I've managed to isolate it to a fragment of code
> | with
> | >> |  >> >> | > | no  unsafePerformIO, that throws exceptions using error
> | (so
> | >> |  >> >> | > | not in
> | >> |  >> >> IO),
> | >> |  >> >> | and
> | >> |  >> >> | > |  operates in IO. Turning off the stack hack makes the
> | bug go
> | >> |  >> >> away,
> | >> |  >> >> | but
> | >> |  >> >> | > |  then so does -O0, marking one of the functions it
> | calls
> | >> |  >> >> NOINLINE, or
> | >> |  >> >> | > |  moving an INLINE function it calls to a different
> | module,
> | >> |  >> >> | > | or
> | >> |  >> >> adding
> | >> |  >> >> | a
> | >> |  >> >> | > |  putStrLn under a catch block - it's very sensitive to
> | the
> | >> |  >> >> | > | exact  conditions. This test and this exact code worked
> | fine
> | >> |  >> >> | > | with GHC
> | >> |  >> >> 7.10
> | >> |  >> >> | > |  RC2.
> | >> |  >> >> | > |
> | >> |  >> >> | > |  I was wondering if there have been any state hack
> | related
> | >> |  >> >> changes or
> | >> |  >> >> | > |  other potentially dangerous optimisation changes since
> | RC2?
> | >> |  >> >> | > | I'll  continue to try reducing the bug, but it's
> | somewhat
> | >> |  >> >> | > | difficult as
> | >> |  >> >> the
> | >> |  >> >> | > |  larger system is quite big, and the code is very
> | sensitive.
> | >> |  >> >> | > |
> | >> |  >> >> | > |  Thanks, Neil
> | >> |  >> >> | > |  _______________________________________________
> | >> |  >> >> | > |  ghc-devs mailing list
> | >> |  >> >> | > |  ghc-devs at haskell.org
> | >> |  >> >> | > |  http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-
> | devs
> | >> |  >> _______________________________________________
> | >> |  >> ghc-devs mailing list
> | >> |  >> ghc-devs at haskell.org
> | >> |  >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> | >> |  >
> | >> |  >
> | >> _______________________________________________
> | >> ghc-devs mailing list
> | >> ghc-devs at haskell.org
> | >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> | >
> | > --
> | > "Elegance is not optional" -- Richard O'Keefe
> | >
> |
> |
> |
> | --
> | Regards,
> |
> | Austin Seipp, Haskell Consultant
> | Well-Typed LLP, http://www.well-typed.com/
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20150321/0a6e172d/attachment-0001.html>


More information about the ghc-devs mailing list