[GHC] #10296: Segfaults when using dynamic wrappers and concurrency

GHC ghc-devs at haskell.org
Mon Apr 13 13:13:24 UTC 2015


#10296: Segfaults when using dynamic wrappers and concurrency
-------------------------------------+-------------------------------------
        Reporter:  bitonic           |                   Owner:
            Type:  bug               |                  Status:  new
        Priority:  normal            |               Milestone:
       Component:  Compiler          |                 Version:  7.11
      Resolution:                    |                Keywords:
Operating System:  Unknown/Multiple  |            Architecture:
 Type of failure:  None/Unknown      |  Unknown/Multiple
      Blocked By:                    |               Test Case:
 Related Tickets:                    |                Blocking:
                                     |  Differential Revisions:
-------------------------------------+-------------------------------------
Description changed by bitonic:

Old description:

> I had a largish program that sometimes segfaulted, the segfault seemingly
> coming from the code that gets a C pointer from a Haskell function.
>
> After much sweat I've managed to produce a self-contained program that
> exhibits the same behavior:
>
> {{{
> bitonic@clay /tmp/ptr-crash % uname -a
> Linux clay 3.13.0-48-generic #80-Ubuntu SMP Thu Mar 12 11:16:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> bitonic@clay /tmp/ptr-crash % cabal configure --disable-library-profiling -w ghc-7.11.20150411
> Resolving dependencies...
> Configuring ptr-crash-0...
> bitonic@clay /tmp/ptr-crash % cabal build
> Building ptr-crash-0...
> Preprocessing executable 'ptr-crash' for ptr-crash-0...
> [1 of 1] Compiling Main             ( Main.hs, dist/build/ptr-crash/ptr-crash-tmp/Main.o )
> Linking dist/build/ptr-crash/ptr-crash ...
> bitonic@clay /tmp/ptr-crash % strace -f -r -o strace-out ./dist/build/ptr-crash/ptr-crash +RTS -N2 -RTS
> [1]    26612 segmentation fault (core dumped)  strace -f -r -o strace-out ./dist/build/ptr-crash/ptr-crash +RTS -N2 -RTS
> }}}
>
> I'm running GHC HEAD on a Linux 64-bit machine.  In the larger program,
> I'm pretty sure the segfaults happened on GHC 7.8.4 too, but currently I
> can reproduce it only on 7.10 and later.
>
> More details (thanks to Sergei Trofimovich on #ghc for helping me in
> investigating this):
>
> * The segfault only happens when using `-N2` or more.
> * Curiously, the segfault seems to happen much more often when compiling
> the program with `-g`.
> * `strace`ing the program when segfaulting shows that all the threads
> crash together right after some calls to `mremap`.  I've attached the end
> of the output of `strace`.
> * `gdb`ing the program and breaking on `mremap` shows that all the calls
> to `mremap` originate from `getStablePtr`.  I've attached a run of `gdb`
> that shows this pattern.
>
> Sergei had a hunch that this had to do with thread-unsafe calls to
> `stgReallocBytes` in `enlargeStablePtrTable`.

New description:

 I had a largish program that sometimes segfaulted, the segfault seemingly
 coming from the code that gets a C pointer from a Haskell function.

 After much sweat I've managed to produce a self-contained program that
 exhibits the same behavior:

 {{{
 bitonic@clay /tmp/ptr-crash % uname -a
 Linux clay 3.13.0-48-generic #80-Ubuntu SMP Thu Mar 12 11:16:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
 bitonic@clay /tmp/ptr-crash % cabal configure --disable-library-profiling -w ghc-7.11.20150411
 Resolving dependencies...
 Configuring ptr-crash-0...
 bitonic@clay /tmp/ptr-crash % cabal build
 Building ptr-crash-0...
 Preprocessing executable 'ptr-crash' for ptr-crash-0...
 [1 of 1] Compiling Main             ( Main.hs, dist/build/ptr-crash/ptr-crash-tmp/Main.o )
 Linking dist/build/ptr-crash/ptr-crash ...
 bitonic@clay /tmp/ptr-crash % strace -f -r -o strace-out ./dist/build/ptr-crash/ptr-crash +RTS -N2 -RTS
 [1]    26612 segmentation fault (core dumped)  strace -f -r -o strace-out ./dist/build/ptr-crash/ptr-crash +RTS -N2 -RTS
 }}}

 I'm running GHC HEAD on a Linux 64-bit machine.  In the larger program, I'm
 pretty sure the segfaults happened on GHC 7.8.4 too, but currently I can
 reproduce it only on 7.10 and later.

 More details (thanks to Sergei Trofimovich on #ghc for helping me in
 investigating this):

 * The segfault only happens when using `-N2` or more.
 * Curiously, the segfault seems to happen much more often when compiling
 the program with `-g`.
 * The segfault doesn't happen every time; I get it roughly half the time
 on my machine.
 * `strace`ing the program when segfaulting shows that all the threads
 crash together right after some calls to `mremap`.  I've attached the end
 of the output of `strace`.
 * `gdb`ing the program and breaking on `mremap` shows that all the calls
 to `mremap` originate from `getStablePtr`.  I've attached a run of `gdb`
 that shows this pattern.
 * The segfault only happens with repeated calls to the dynamic wrapper and
 with certain timings, which explains the weird nature of the example (I
 kind of mimicked the behaviour of a C function we were calling from a
 proprietary C library).  Note that the call to `sum_arr` is not really
 important and it's there just so that some time is spent in the callback
 -- the example works equally well if we convert the pointer to a Haskell
 vector and sum it from Haskell.
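
For readers unfamiliar with the term, the following is a minimal sketch of what "repeated calls to the dynamic wrapper" from several threads can look like. It is not the reporter's `Main.hs` (which is attached to the ticket, not included here), and it is not claimed to reproduce the crash. The names `Callback`, `mkCallback`, `callCallback` and `roundTrip` are hypothetical; the thread and iteration counts are arbitrary. To match the conditions in the report it would be built with `-threaded` and run with `+RTS -N2`.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, forM_, replicateM_)
import Foreign.C.Types (CInt)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- Hypothetical callback type, chosen only for illustration.
type Callback = CInt -> IO CInt

-- "wrapper" import: turns a Haskell function into a C function pointer.
foreign import ccall "wrapper"
  mkCallback :: Callback -> IO (FunPtr Callback)

-- "dynamic" import: calls through a C function pointer from Haskell.
foreign import ccall "dynamic"
  callCallback :: FunPtr Callback -> Callback

-- Create a wrapper, call through it once, then release it.
roundTrip :: CInt -> IO CInt
roundTrip n = do
  fp <- mkCallback (\x -> return (x + 1))
  r  <- callCallback fp n
  freeHaskellFunPtr fp
  return r

main :: IO ()
main = do
  -- Four threads (an arbitrary number) each create and call wrappers in a loop.
  dones <- forM [1 .. 4 :: Int] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO $ do
      replicateM_ 10000 (roundTrip 41)
      putMVar done ()
    return done
  forM_ dones takeMVar
  roundTrip 41 >>= print
```

Each `roundTrip` call allocates a fresh wrapper and frees it again, so wrapper creation happens concurrently on every capability the RTS is given.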

 Sergei had a hunch that this had to do with thread-unsafe calls to
 `stgReallocBytes` in `enlargeStablePtrTable`.
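
One way to probe that hunch without going through the FFI wrapper at all might be to allocate stable pointers directly from several threads. This is only a sketch under that assumption: `churn` is a hypothetical helper, the sizes are arbitrary, and nothing here is confirmed to trigger the segfault. It uses only `newStablePtr`, `deRefStablePtr` and `freeStablePtr` from `Foreign.StablePtr`.

```haskell
module Main (main) where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)
import Foreign.StablePtr (deRefStablePtr, freeStablePtr, newStablePtr)

-- Hypothetical helper: allocate n stable pointers, read them all back,
-- then free them. Keeping all n alive at once is meant to make the
-- stable pointer table grow (assumption based on the hunch in the ticket).
churn :: Int -> IO Int
churn n = do
  ptrs <- mapM newStablePtr [1 .. n]
  vals <- mapM deRefStablePtr ptrs
  mapM_ freeStablePtr ptrs
  return (sum vals)

main :: IO ()
main = do
  -- Four threads (an arbitrary number) allocate concurrently.
  dones <- forM [1 .. 4 :: Int] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO (churn 100000 >>= putMVar done)
    return done
  totals <- mapM takeMVar dones
  -- Every thread should read back exactly the values it stored.
  print (all (== sum [1 .. 100000]) totals)
```

If the values read back ever differ from the values stored, or the program crashes under `+RTS -N2`, that would be consistent with a race in the table; a clean run proves nothing either way.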

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/10296#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

