[Haskell-cafe] How do I debug this RTS segfault?

Carter Schonwald carter.schonwald at gmail.com
Tue Jul 26 00:40:50 UTC 2016


forkProcess is very, very different from forkIO and forkOS. Have you
tried fork bombing from the shell with a similar program? I don't think your OS
can handle 2^1000 process IDs, right? I seem to recall process IDs being 32
or 64 bit.
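
To make the distinction concrete, here is a minimal sketch (not taken from the original report) that contrasts forkIO with forkProcess. It assumes the `unix` package is available. The helper name `pidSeenByThread` is made up for illustration. forkOS is left out: it creates a bound thread, still inside the same process, and needs the -threaded RTS.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import System.IO (BufferMode (LineBuffering), hSetBuffering, stdout)
import System.Posix.Process (forkProcess, getProcessID, getProcessStatus)
import System.Posix.Types (ProcessID)

-- Hypothetical helper: ask a forkIO thread which PID it sees.
-- forkIO creates a lightweight Haskell thread inside the same OS process.
pidSeenByThread :: IO ProcessID
pidSeenByThread = do
  box <- newEmptyMVar
  _ <- forkIO (getProcessID >>= putMVar box)
  takeMVar box

main :: IO ()
main = do
  -- Flush at every newline so the child does not inherit unflushed output.
  hSetBuffering stdout LineBuffering
  parentPid <- getProcessID

  threadPid <- pidSeenByThread
  putStrLn ("forkIO thread shares the parent's PID: "
            ++ show (threadPid == parentPid))

  -- forkProcess goes through fork(2): the action runs in a new OS process.
  childPid <- forkProcess $ do
    me <- getProcessID
    putStrLn ("forkProcess child has its own PID: "
              ++ show (me /= parentPid))

  -- Block until the child exits so that it is reaped.
  _ <- getProcessStatus True False childPid
  pure ()
```

Build it the same way as the reproduction case, for example with `ghc --make`, adding -threaded if needed.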

On Sunday, July 24, 2016, Lana Black <lanablack at amok.cc> wrote:

> On 21:25 Sun 24 Jul     , Anatoly Yakovenko wrote:
> > It's probably out of file descriptors. It's possible that it tries to
> > open another one during the error handling.
> > On Sun, Jul 24, 2016 at 10:50 AM Lana Black <lanablack at amok.cc> wrote:
> >
> > > Hello,
> > >
> > > I have run into this RTS bug recently. In short, when executing
> > > multiple consecutive forks, the process is terminated by SIGSEGV
> > > after 500-600 or so. I know this kind of thing is totally
> > > artificial, but still.
> > >
> > > The problem I have is that I can't get any meaningful backtrace in gdb.
> > > For example, for threaded RTS I get this
> > >
> > > (gdb) bt
> > > #0  0x0000000000560d63 in
> > > base_GHCziEventziThread_ensureIOManagerIsRunning1_info ()
> > > Backtrace stopped: Cannot access memory at address 0x7fffff7fcea0
> > >
> > > For non-threaded RTS I get this
> > >
> > > (gdb) bt
> > > #0  0x00000000007138c9 in stg_makeStablePtrzh ()
> > > Backtrace stopped: Cannot access memory at address 0x7fffff7fc720
> > >
> > > Build command: ghc --make -O2 -g -fforce-recomp fork.hs
> > > Add -threaded if needed.
> > >
> > > I was able to reproduce this bug with both GHC 7.10.3 and today's HEAD
> > > with the code below.
> > >
> > > >import System.Exit (exitSuccess)
> > > >import System.Posix.Process (forkProcess)
> > > >
> > > >fork_ n | n > 0 = processPid =<< forkProcess (fork_ $! n - 1)
> > > >        | otherwise = putStrLn "I'm done!"
> > > >
> > > >processPid pid | pid  > 0 = exitSuccess
> > > >               | pid  < 0 = putStrLn "OOOPS, forkProcess failed!"
> > > >               | otherwise = pure ()
> > > >
> > > >main = fork_ 1000
> > > >
> > >
> > > With best regards.
> > > _______________________________________________
> > > Haskell-Cafe mailing list
> > > To (un)subscribe, modify options or view archives go to:
> > > http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> > > Only members subscribed via the mailman list are allowed to post.
>
> Seems like this is not the case. I had overlooked GHC's -debug
> option; with it I'm now able to get a stack trace. Furthermore, the
> number of used file descriptors is well within the limit, and changing
> the latter with `ulimit -n` does not affect the outcome.
>
> Curiously, the stacks are rather different for threaded and non-threaded
> RTS.
>
> Non-threaded:
> (gdb) bt
> #0  INFO_PTR_TO_STRUCT (info=<error reading variable: Cannot access
> memory at address 0x7fffff7feff0>) at
> includes/rts/storage/ClosureMacros.h:60
> #1  0x000000000070e956 in get_itbl (c=0x20006e7f8) at
> includes/rts/storage/ClosureMacros.h:87
> #2  0x000000000070ec3c in closure_sizeW (p=0x20006e7f8) at
> includes/rts/storage/ClosureMacros.h:439
> #3  0x000000000070ecf7 in overwritingClosure (p=0x20006e7f8) at
> includes/rts/storage/ClosureMacros.h:555
> #4  0x0000000000725dd7 in stg_upd_frame_info ()
> #5  0x0000000000000000 in ?? ()
>
> Threaded:
> (gdb) bt
> #0  0x00007ffff6ce49ce in _IO_vfprintf_internal (s=s@entry=0x7fffff7ff430,
> format=format@entry=0x7ffff75c3550 "/proc/self/task/%u/comm",
> ap=ap@entry=0x7fffff7ff558)
>     at vfprintf.c:1266
> #1  0x00007ffff6d0954b in __IO_vsprintf (string=0x7fffff7ff630
> "`\366\177\377\377\177", format=0x7ffff75c3550 "/proc/self/task/%u/comm",
> args=args@entry=0x7fffff7ff558)
>     at iovsprintf.c:42
> #2  0x00007ffff6cecd47 in __sprintf (s=s@entry=0x7fffff7ff630
> "`\366\177\377\377\177", format=format@entry=0x7ffff75c3550
> "/proc/self/task/%u/comm") at sprintf.c:32
> #3  0x00007ffff75c1f2b in pthread_setname_np (th=140737317025536,
> name=0x78ba04 "ghc_ticker") at
> ../sysdeps/unix/sysv/linux/pthread_setname.c:49
> #4  0x000000000072ce4e in initTicker (interval=10000000,
> handle_tick=0x71a23d <handle_tick>) at rts/posix/itimer/Pthread.c:173
> #5  0x000000000071a32f in initTimer () at rts/Timer.c:111
> #6  0x0000000000703c26 in forkProcess (entry=0x207) at rts/Schedule.c:2072
> #7  0x0000000000405bf7 in s7dF_info ()
> #8  0x0000000000000000 in ?? ()
>
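
The file descriptor check described in the quoted message can also be done from inside the program. What follows is only a sketch under a few assumptions: a Linux system (it reads /proc/self/fd), and the `unix` and `directory` packages. The names `openFdCount` and `showLimit` are made up for illustration and are not part of the original report.

```haskell
import System.Directory (listDirectory)
import System.Posix.Resource
  ( Resource (ResourceOpenFiles)
  , ResourceLimit (..)
  , ResourceLimits (softLimit)
  , getResourceLimit
  )

-- Hypothetical helper: count the descriptors open in this process.
-- Linux-specific. Listing the directory briefly opens one descriptor
-- itself, so the result may be one higher than the steady-state count.
openFdCount :: IO Int
openFdCount = length <$> listDirectory "/proc/self/fd"

-- Hypothetical helper: render a resource limit for printing.
showLimit :: ResourceLimit -> String
showLimit (ResourceLimit n)     = show n
showLimit ResourceLimitInfinity = "unlimited"
showLimit ResourceLimitUnknown  = "unknown"

main :: IO ()
main = do
  used   <- openFdCount
  limits <- getResourceLimit ResourceOpenFiles
  putStrLn ("open descriptors: " ++ show used)
  -- The soft limit is the value that `ulimit -n` reports by default.
  putStrLn ("soft limit: " ++ showLimit (softLimit limits))
```

Calling `openFdCount` in the child just before each forkProcess would show whether the count grows with the fork depth.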