marlowsd at gmail.com
Fri Oct 28 13:10:02 UTC 2016
I see, but the compiler has no business caching things across
requestSync(), which can in principle change anything: even if the compiler
could see all the code, it would find a pthread_cond_wait() in there.
Anyway, I've found the problem - it was caused by a subsequent GC
overwriting the values of gc_threads.idle before the previous GC had
finished releaseGCThreads(), which reads those values. Diff on the way...
On 28 October 2016 at 11:58, Ryan Yates <fryguybob at gmail.com> wrote:
> Right, it is compiler effects at this boundary that I'm worried about,
> values that are not read from memory after the changes have been made, not
> memory effects or data races.
> On Fri, Oct 28, 2016 at 3:02 AM, Simon Marlow <marlowsd at gmail.com> wrote:
>> Hi Ryan, I don't think that's the issue. Those variables can only be
>> modified in setNumCapabilities, which acquires *all* the capabilities
>> before it makes any changes. There should be no other threads running RTS
>> code(*) while we change the number of capabilities. In particular we
>> shouldn't be in releaseGCThreads while enabled_capabilities is being
>> modified.
>> (*) well except for the parts at the boundary with the external world
>> which run without a capability, such as rts_lock(), which acquires one.
>> On 27 Oct 2016 17:10, "Ryan Yates" <fryguybob at gmail.com> wrote:
>>> Briefly looking at the code it seems like several global variables
>>> involved should be volatile: n_capabilities, enabled_capabilities, and
>>> capabilities. Perhaps in a loop like in scheduleDoGC the compiler moves
>>> the reads of n_capabilities or capabilities outside the loop. A failed
>>> requestSync in that loop would not get updated values for those global
>>> pointers. That particular loop isn't doing that optimization for me, but I
>>> think it could happen without volatile.
>>> On Thu, Oct 27, 2016 at 9:18 AM, Ben Gamari <ben at smart-cactus.org> wrote:
>>>> Simon Marlow <marlowsd at gmail.com> writes:
>>>> > I haven't been able to reproduce the failure yet. :(
>>>> Indeed I've also not seen it in my own local builds. It's quite a
>>>> fragile failure.
>>>> - Ben
>>>> ghc-devs mailing list
>>>> ghc-devs at haskell.org