alignment and the evil gc assertion failure
Evan Laforge
qdunkan at gmail.com
Mon Sep 6 14:03:32 EDT 2010
A long time ago I filed a ghc ticket about a gc assertion failure.
(I think it was when 6.10 first came out: the problem didn't happen
with the previous version, and I think 6.10 changed how the FFI used
alignment.) Unfortunately it was so hard to reproduce and reduce to a
manageable example that I wound up thinking I had fixed it and closed
the ticket as user error. However, I'm now pretty sure I've tracked
down what the problem was, since I changed one thing and I haven't had
that crash since.
To recap, the crash is an assertion failure in the gc. I've seen it at two positions:
seq: internal error: ASSERTION FAILED: file rts/dist/build/sm/Evac_thr.c, line 298
seq: internal error: ASSERTION FAILED: file rts/dist/build/sm/Evac_thr.c, line 369
The problem was in the marshalling of a certain struct. The struct
has 3 Color fields and two char fields. The Colors are simply triples
of chars. I'd been using an alignment macro I've seen around:
#let alignment t = "%lu", (unsigned long)offsetof(struct {char x__; t (y__); }, y__)
Since my struct is made out of chars, this macro returns an alignment
of 1. However, an alignment of 1 seems to be what leads to the GC
crash. After setting the alignment to 4 I've never had the crash
again.
So... what's going on here? Is the macro wrong? Alignment 1 seems
correct for something built of chars... or does a 'struct { char,
char, char }' embedded in another struct turn it into alignment 4
somehow?
Alignment seems like a particularly problematic part of the FFI, which
is in all other respects very easy to use. It's low level, poorly
understood (well, by me at least), not super well documented, and one
little mistake can either be harmless, or lead to a *very* hard to
track down bug.
If the alignment macro is correct, can it be built into hsc2hs? Copying
and pasting some magic I saw on the net into every hsc file doesn't
give me a good feeling. If it's incorrect, what would a correct one
be? One odd thing about that macro is that if you mis-spell a field
name, gcc gets a bus error (OS X 10.5.8, gcc 4.0.1, does anyone else
see this?) Or is the alignment correct and it really is a ghc bug?
If so, how can I help track it down? The best I can think of is to
make a small program with that same data structure, pass it around
some, and then generate and collect a lot of garbage. But this bug has
been really hard to pin down in the past: change one little thing and
it disappears, only to pop up again in 6 months.