[GHC] #15544: Non-deterministic segmentation fault in cryptohash-sha256 testsuite
GHC
ghc-devs at haskell.org
Tue Sep 11 09:06:01 UTC 2018
#15544: Non-deterministic segmentation fault in cryptohash-sha256 testsuite
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: highest | Milestone: 8.6.1
Component: Compiler | Version: 8.4.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by osa1):
I managed to reproduce it and did some debugging.
Here's the problem. We have this object:
{{{
>>> print *((StgClosure *) 0xe4c558)
$21 = {
header = {
info = 0x409968 <reFi_info>
},
payload = 0xe4c560
}
}}}
It's defined like this:
{{{
$wxs_reFi
:: GHC.Prim.Int#
-> (# Data.ByteString.Internal.ByteString,
[Data.ByteString.Internal.ByteString] #)
[GblId, Arity=1, Str=<S,1*U>, Unf=OtherCon []] =
sat-only [] \r [ww_seSe]
case ww_seSe of ds1_seSf [Occ=Once] {
__DEFAULT ->
let {
sat_seSk [Occ=Once] ::
[Data.ByteString.Internal.ByteString]
[LclId] =
[ds1_seSf] \u []
case -# [ds1_seSf 1#] of sat_seSg [Occ=Once] {
__DEFAULT ->
case $wxs_reFi sat_seSg of {
(#,#) ww2_seSi [Occ=Once] ww3_seSj
[Occ=Once] ->
: [ww2_seSi ww3_seSj];
};
};
} in (#,#) [x_reFh sat_seSk];
1# -> (#,#) [x_reFh GHC.Types.[]];
};
}}}
Notice that (1) it's a FUN_STATIC and (2) it refers to another static
object, x_reFh:
{{{
x_reFh :: Data.ByteString.Internal.ByteString
[GblId] =
[] \u []
case
newMutVar# [GHC.ForeignPtr.NoFinalizers GHC.Prim.realWorld#]
of
{ (#,#) ipv_seS6 [Occ=Once] ipv1_seS7 [Occ=Once] ->
case __pkg_ccall bytestring-0.10.8.2 [addr#1_reFg ipv_seS6]
of {
(#,#) _ [Occ=Dead] ds2_seSb [Occ=Once] ->
case word2Int# [ds2_seSb] of sat_seSd [Occ=Once] {
__DEFAULT ->
let {
sat_seSc [Occ=Once] ::
GHC.ForeignPtr.ForeignPtrContents
[LclId] =
CCCS GHC.ForeignPtr.PlainForeignPtr!
[ipv1_seS7];
} in
Data.ByteString.Internal.PS [addr#1_reFg
sat_seSc 0# sat_seSd];
};
};
};
}}}
The FUN_STATIC SRT optimization should apply to this object, so instead of
an SRT we should find the SRT entries in its payload. However, the ptrs
field in this object's info table is 0:
{{{
>>> set $itbl = itbl_to_fun_itbl(get_itbl((StgClosure *) 0xe4c558))
>>> print *$itbl
$21 = {
f = {
slow_apply_offset = 59278791,
__pad_slow_apply_offset = 1572864,
b = {
bitmap = 10376465356425854976,
bitmap_offset = -907476992,
__pad_bitmap_offset = 3387490304
},
fun_type = 4,
arity = 1
},
i = {
layout = {
payload = {
ptrs = 0,
nptrs = 0
},
bitmap = 0,
large_bitmap_offset = 0,
__pad_large_bitmap_offset = 0,
selector_offset = 0
},
type = 14,
srt = 10759120,
code = 0x409968 <reFi_info> "I\203\304\030M;\245X\003"
}
}
}}}
So it seems that for some reason we don't actually do the FUN_STATIC SRT
optimization for this object. Indeed, I can get the reference to x_reFh via
the srt field:
{{{
>>> print *((StgClosure*) (((StgWord) (($itbl)+1)) + ($itbl)->i.srt)) <--- GET_FUN_SRT
$10 = {
header = {
info = 0x4097e8 <reFh_info>
},
payload = 0xe4c540
}
>>> print ((StgClosure*) (((StgWord) (($itbl)+1)) + ($itbl)->i.srt))
$11 = (StgClosure *) 0xe4c538
}}}
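The expression in the first command is plain pointer arithmetic: the srt field is added as an offset to the address just past the function info table, which in this session is also the entry code address 0x409968. A minimal sketch of that arithmetic follows. The helper name srt_closure_address is made up for illustration, and it assumes (as the gdb expression does) that srt holds a relative offset rather than an absolute pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not an RTS function. It mirrors the gdb expression
 *   ((StgWord)($itbl + 1)) + $itbl->i.srt
 * from the session above. info_table_end is the address just past the
 * info table; in this session that is the entry code address. */
static uint64_t srt_closure_address(uint64_t info_table_end, int64_t srt_offset)
{
    assert(srt_offset != 0); /* a zero srt field means there is no SRT */
    return info_table_end + (uint64_t)srt_offset;
}
```

With the values from the session, srt_closure_address(0x409968, 10759120) gives 0xe4c538, the address gdb prints as $11.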
x_reFh is originally a THUNK and becomes an IND_STATIC after evaluation:
{{{
>>> call printClosure((StgClosure *) 0xe4c538)
THUNK(0x4097e8)
>>> c
Hardware watchpoint 5: ((StgClosure *) 0xe4c538)->header.info
Old value = (const StgInfoTable *) 0x4097e8 <reFh_info>
New value = (const StgInfoTable *) 0xdce688 <stg_IND_STATIC_info>
SET_INFO (c=0xe4c538, info=0xdce688 <stg_IND_STATIC_info>) at
includes/rts/storage/ClosureMacros.h:50
50 }
>>> bt
#0 SET_INFO (c=0xe4c538, info=0xdce688 <stg_IND_STATIC_info>) at
includes/rts/storage/ClosureMacros.h:50
#1 0x0000000000dbac9b in lockCAF (reg=0x1020818 <MainCapability+24>,
caf=0xe4c538) at rts/sm/Storage.c:415
#2 0x0000000000dbacc5 in newCAF (reg=0x1020818 <MainCapability+24>,
caf=0xe4c538) at rts/sm/Storage.c:425
#3 0x0000000000409809 in reFh_info ()
#4 0x0000000000000000 in ?? ()
>>> call printClosure((StgClosure *) 0xe4c538)
IND_STATIC(0x42004d5878)
}}}
Now, as long as reFi is reachable, 0xe4c538 should be reachable too, because
it's in the SRT of reFi. Let's continue:
{{{
>>> c
... assertion failure ...
>>> bt
#0 0x0000000000db8800 in LOOKS_LIKE_INFO_PTR_NOT_NULL
(p=12297829382473034410) at includes/rts/storage/ClosureMacros.h:260
#1 0x0000000000db884f in LOOKS_LIKE_INFO_PTR (p=12297829382473034410) at
includes/rts/storage/ClosureMacros.h:265
#2 0x0000000000db8887 in LOOKS_LIKE_CLOSURE_PTR (p=0x4200122a10) at
includes/rts/storage/ClosureMacros.h:270
#3 0x0000000000db9240 in evacuate (p=0xe4c540) at rts/sm/Evac.c:516
#4 0x0000000000ddf87e in scavenge_static () at rts/sm/Scav.c:1690
#5 0x0000000000ddff0a in scavenge_loop () at rts/sm/Scav.c:2085
#6 0x0000000000db4c49 in scavenge_until_all_done () at rts/sm/GC.c:1088
#7 0x0000000000db38ba in GarbageCollect (collect_gen=1,
do_heap_census=false, gc_type=0, cap=0x1020800 <MainCapability>,
idle_cap=0x0) at rts/sm/GC.c:416
#8 0x0000000000d995a7 in scheduleDoGC (pcap=0x7fff635d6780,
task=0x2802f60, force_major=false) at rts/Schedule.c:1799
#9 0x0000000000d98a7f in schedule (initialCapability=0x1020800
<MainCapability>, task=0x2802f60) at rts/Schedule.c:545
#10 0x0000000000d99f79 in scheduleWaitThread (tso=0x4200105388, ret=0x0,
pcap=0x7fff635d6880) at rts/Schedule.c:2533
#11 0x0000000000da8b4c in rts_evalLazyIO (cap=0x7fff635d6880, p=0xe4d928,
ret=0x0) at rts/RtsAPI.c:530
#12 0x0000000000da9297 in hs_main (argc=7, argv=0x7fff635d6a78,
main_closure=0xe4d928, rts_config=...) at rts/RtsMain.c:72
#13 0x000000000041210c in main ()
}}}
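The p argument in frames #0 and #1 is printed in decimal, which hides a pattern. The sketch below uses only standard C (the helper name repeated_byte is made up) to check whether a 64-bit word consists of one repeated byte. 12297829382473034410 is 0xaaaaaaaaaaaaaaaa, so the word read as an info pointer looks like a fill pattern rather than a real address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true if all eight bytes of word are equal,
 * and stores that byte in *fill_out. */
static bool repeated_byte(uint64_t word, uint8_t *fill_out)
{
    assert(fill_out != NULL);
    uint8_t first = (uint8_t)(word & 0xff);
    for (int i = 1; i < 8; i++) {
        if ((uint8_t)((word >> (8 * i)) & 0xff) != first) {
            return false;
        }
    }
    *fill_out = first;
    return true;
}
```

Calling repeated_byte(12297829382473034410ULL, &fill) returns true with fill set to 0xaa, whereas a genuine info pointer such as 0x409968 does not match.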
0xe4c540 is the address of the indirectee field of 0xe4c538:
{{{
>>> print &((StgInd*)0xe4c538)->indirectee
$27 = (StgClosure **) 0xe4c540
}}}
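The address follows from the closure layout: gdb places the indirectee field one word after the start of the closure. The sketch below uses a stand-in struct. ind_model is not the RTS definition of StgInd; it assumes a one-word header followed by the indirectee pointer, which matches the 8-byte difference seen in this session.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the layout gdb shows for StgInd; not the RTS definition. */
struct ind_model {
    uint64_t info;        /* header: info pointer */
    uint64_t indirectee;  /* pointer to the evaluated value */
};

/* Address of the indirectee field of a closure located at closure_addr. */
static uint64_t indirectee_field_address(uint64_t closure_addr)
{
    assert(closure_addr % 8 == 0); /* addresses in this session are 8-byte aligned */
    return closure_addr + offsetof(struct ind_model, indirectee);
}
```

For the closure at 0xe4c538 this gives 0xe4c540, which is the p passed to evacuate in frame #3 of the backtrace.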
But the indirectee object has already been cleared (because this is in sanity mode)
{{{
>>> print *UNTAG_CLOSURE(((StgInd*)0xe4c538)->indirectee)
$29 = {
header = {
info = 0xaaaaaaaaaaaaaaaa
},
payload = 0x4200122a18
}
}}}
so at some point it became unreachable. For this object to be unreachable, reFi
should be unreachable too. Let's see whether reFi was reachable in this GC:
{{{
>>> break GarbageCollect
Breakpoint 6 at 0xdb3492: file rts/sm/GC.c, line 226.
>>> break evacuate_static_object if q == 0xe4c558
Breakpoint 7 at 0xdb8f85: file rts/sm/Evac.c, line 333.
>>> reverse-continue
}}}
Breakpoint 7 is hit first, so it seems like reFi is actually reachable. We
should be scavenging it too:
{{{
>>> break Scav.c:1675 if p == 0xe4c558
Breakpoint 8 at 0xddf7cc: file rts/sm/Scav.c, line 1675.
>>> c
>>> bt
#0 scavenge_static () at rts/sm/Scav.c:1675
#1 0x0000000000ddff0a in scavenge_loop () at rts/sm/Scav.c:2085
#2 0x0000000000db4c49 in scavenge_until_all_done () at rts/sm/GC.c:1088
#3 0x0000000000db38ba in GarbageCollect (collect_gen=1,
do_heap_census=false, gc_type=0, cap=0x1020800 <MainCapability>,
idle_cap=0x0) at rts/sm/GC.c:416
#4 0x0000000000d995a7 in scheduleDoGC (pcap=0x7fff635d6780,
task=0x2802f60, force_major=false) at rts/Schedule.c:1799
#5 0x0000000000d98a7f in schedule (initialCapability=0x1020800
<MainCapability>, task=0x2802f60) at rts/Schedule.c:545
#6 0x0000000000d99f79 in scheduleWaitThread (tso=0x4200105388, ret=0x0,
pcap=0x7fff635d6880) at rts/Schedule.c:2533
#7 0x0000000000da8b4c in rts_evalLazyIO (cap=0x7fff635d6880, p=0xe4d928,
ret=0x0) at rts/RtsAPI.c:530
#8 0x0000000000da9297 in hs_main (argc=7, argv=0x7fff635d6a78,
main_closure=0xe4d928, rts_config=...) at rts/RtsMain.c:72
#9 0x000000000041210c in main ()
}}}
At this point, if I step a few more lines, I get the original assertion
failure.
So in summary: a FUN_STATIC is reachable, but somehow a static object in
its SRT is collected.
Alternatively, it could be that the FUN_STATIC becomes unreachable and
somehow becomes reachable again later.
Simon, I'm looking at the implementation of the SRT optimization for
FUN_STATIC. I don't understand why we check both the srt field and
ptrs of FUN_STATICs in this code (in evacuate()):
{{{
case FUN_STATIC:
if (info->srt != 0 || info->layout.payload.ptrs != 0) {
evacuate_static_object(STATIC_LINK(info,(StgClosure *)q),
q);
}
return;
}}}
As far as I understand, for FUN_STATICs we should only look at the payload,
no? I think that's what the note in CmmBuildInfoTables.hs says.
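To make the question concrete, here is a small model of the two candidate conditions applied to the info table printed earlier (srt = 10759120, ptrs = 0). The struct and function names are invented for this illustration. Only the first condition is taken from the evacuate() excerpt above; the second is the payload-only check being asked about, not existing RTS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Invented model of the two info table fields the check reads. */
struct fun_static_model {
    int64_t  srt;   /* srt field; 0 means no SRT */
    uint32_t ptrs;  /* layout.payload.ptrs */
};

/* Condition as written in the evacuate() excerpt. */
static bool passes_current_check(struct fun_static_model i)
{
    return i.srt != 0 || i.ptrs != 0;
}

/* Hypothetical alternative: look at the payload only. */
static bool passes_payload_only_check(struct fun_static_model i)
{
    return i.ptrs != 0;
}

/* Self-check using the values gdb printed for reFi. */
static void check_reFi_case(void)
{
    struct fun_static_model reFi = { .srt = 10759120, .ptrs = 0 };
    assert(passes_current_check(reFi));        /* true only because srt != 0 */
    assert(!passes_payload_only_check(reFi));  /* would skip this object */
}
```

For this particular object, the srt test is the only reason evacuate_static_object is called at all. A payload-only check would skip reFi as long as its info table reports ptrs = 0.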
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15544#comment:20>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler