[nhc-bugs] Debugging a segfault

Sun Oct 23 09:40:34 EDT 2005

dons at cse.unsw.edu.au (Donald Bruce Stewart) writes:

> I need some help chasing down a segfault in nhc98 1.1{6,8} on OpenBSD/powerpc.
> The x86 version runs nicely (after the mmap patch), however both 1.16 and 1.18
> die at the following point in the build process:
>
>     /home/hack/dons/build/nhc98-1.18/script/nhc98 -c +CTS -lib  -redefine -CTS   +RTS -H32M -RTS -o /home/hack/dons/build/nhc98-1.18/targets/powerpc-OpenBSD/obj/prelude/DErrNo/DErrNo.o DErrNo.hs
>     Segmentation fault (core dumped) 

This is the first point in the build process where the freshly-built
compiler is run on Haskell source code, so it is the usual indicator of
a faulty nhc98.  Historically, segfaults here have been associated with
changes in the way gcc lays out static arrays of bytecodes, e.g. by
putting extra padding space between arrays that are supposed to be
adjacent.
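To make the layout concern concrete, here is a minimal sketch in C that measures the gap between two static byte arrays. The names table_a, table_b, gap_after and table_gap are hypothetical stand-ins for illustration, not nhc98's actual bytecode tables or runtime functions, and the C standard makes no promise about the result for separate objects, which is exactly the non-portable assumption at issue.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for two bytecode tables that the runtime
 * would like to treat as one contiguous region. */
static unsigned char table_a[5] = {1, 2, 3, 4, 5};
static unsigned char table_b[3] = {6, 7, 8};

/* Number of bytes between the end of `first` and the start of `second`.
 * 0 means the two objects are adjacent; a positive value is padding;
 * a negative value means `second` was placed before the end of `first`. */
ptrdiff_t gap_after(const void *first, size_t first_size, const void *second)
{
    intptr_t end   = (intptr_t)((uintptr_t)first + first_size);
    intptr_t start = (intptr_t)(uintptr_t)second;
    return (ptrdiff_t)(start - end);
}

/* Gap between the two example tables as laid out by this compiler. */
ptrdiff_t table_gap(void)
{
    return gap_after(table_a, sizeof table_a, table_b);
}
```

Printing table_gap() from a small test program built with the same gcc and flags shows whether that compiler pads or reorders the arrays. Only elements of a single array (or members of one struct, subject to padding) have a guaranteed relative layout.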

Which version of gcc did you use to bootstrap nhc98?
Another thought: is the test machine a G5 (64-bit powerpc)?
nhc98 currently only works for 32-bit machines.
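A quick way to answer the 32-bit question for the toolchain actually in use is to ask the C compiler for its pointer width. This is a sketch only; pointer_width_bits and is_32bit_target are hypothetical helper names, and the result describes the target the compiler generates code for, which is not necessarily the same as the CPU model.

```c
#include <limits.h>
#include <stddef.h>

/* Width of a data pointer, in bits, for the target this file is
 * compiled for. */
int pointer_width_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Non-zero when pointers are 32 bits wide on this target. */
int is_32bit_target(void)
{
    return pointer_width_bits() == 32;
}
```

If is_32bit_target() returns 0 when built with the gcc used for bootstrapping, the 32-bit restriction above is the likely culprit.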

nhc98 makes several non-portable assumptions about malloc'd memory, C
compiler behaviour and so on, which frequently seem to lead to this
kind of problem.  Most will be fixed by a forthcoming major change to
both the bytecode generator and RTS of nhc98.  But if there is something
simple we can do in the meantime to work around the difficulty, I am open
to suggestions.

> How do I go about debugging this? gdb wasn't particularly revealing.

Unfortunately, gdb won't be very useful, because when nhc98-generated
bytecode is running, the C stack is generally not used.  All activity
takes place within the run() mutator, except for GC and FFI calls.

That said, I suppose you could try looking at some of the virtual
"registers" in gdb, that is, *ip, *sp, *fp, *hp, etc.
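To illustrate why a C backtrace says so little, here is a toy bytecode loop in C. It is a sketch under stated assumptions: the opcodes OP_PUSH, OP_ADD and OP_HALT are invented for this example, and this is not nhc98's instruction set or its real run() function. It only shows the general shape of an interpreter that keeps its own ip and sp instead of using the C stack.

```c
/* Hypothetical opcodes for a toy stack machine. */
enum { OP_PUSH, OP_ADD, OP_HALT };

#define STACK_SIZE 64

/* Interpret `code` until OP_HALT and return the top of the virtual
 * stack. Returns -1 on stack underflow, overflow or an unknown opcode
 * (so, in this toy, -1 is ambiguous with a computed result of -1). */
int run(const unsigned char *code)
{
    int stack[STACK_SIZE];
    const unsigned char *ip = code; /* virtual instruction pointer */
    int *sp = stack;                /* virtual stack pointer, grows upward */

    for (;;) {
        switch (*ip++) {
        case OP_PUSH:               /* operand is the next byte */
            if (sp == stack + STACK_SIZE) return -1;
            *sp++ = *ip++;
            break;
        case OP_ADD:                /* pop two values, push their sum */
            if (sp - stack < 2) return -1;
            sp--;
            sp[-1] += sp[0];
            break;
        case OP_HALT:
            return sp > stack ? sp[-1] : -1;
        default:
            return -1;
        }
    }
}
```

Running the program {OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT} through this loop never leaves the single C frame for run(), however much the interpreted program does. In a loop of this shape, the state worth inspecting in a debugger is what ip and sp point at, not the C call stack.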

Regards,
    Malcolm