Bad news: apparent bug in casMutVar going back to 7.2

Carter Schonwald carter.schonwald at gmail.com
Sat Feb 1 07:44:00 UTC 2014


https://ghc.haskell.org/trac/ghc/ticket/8724#ticket is the ticket

when i'm more awake i'll experiment some more
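
For anyone who wants to poke at the operand-width guess from the quoted messages below, here is a minimal sketch in C. It only illustrates the hypothesis (two words that agree in their low 32 bits but differ as 64-bit values). It does not show that GHC's cas actually behaves this way. The helper names equal_low32, cas64 and width_demo are made up for this example, and it assumes a 64-bit target with the GCC/Clang __sync builtins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compare only the low 32 bits of two words,
 * which is all a 32-bit-wide compare would look at. */
static inline bool equal_low32(uint64_t a, uint64_t b)
{
    return (uint32_t)a == (uint32_t)b;
}

/* Hypothetical helper: CAS on the full 64-bit word via the GCC/Clang builtin. */
static inline bool cas64(volatile uint64_t *p, uint64_t expected, uint64_t desired)
{
    return __sync_bool_compare_and_swap(p, expected, desired);
}

/* Single-threaded demonstration of the two comparisons disagreeing. */
static void width_demo(void)
{
    uint64_t a = 0x0000000100000042ULL;
    uint64_t b = 0x0000000200000042ULL;   /* same low half, different high half */

    assert(equal_low32(a, b));            /* a 32-bit compare sees them as equal */
    assert(a != b);                       /* a 64-bit compare does not */

    volatile uint64_t cell = a;
    assert(!cas64(&cell, b, 0));          /* full-width CAS rejects the mismatch */
    assert(cell == a);                    /* and leaves the cell untouched */
}
```

A stress test along these lines would need values that collide in the low half, which ordinary heap pointers may rarely do.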


On Sat, Feb 1, 2014 at 2:33 AM, Carter Schonwald <carter.schonwald at gmail.com> wrote:

> i have a ticket for tracking this, though i'm thinking my initial attempt
> at a patch generates the same object code as it did before.
>
> @ryan, what CPU variant are you testing this on? is this on a NUMA machine
> or something?
>
>
> On Sat, Feb 1, 2014 at 1:58 AM, Carter Schonwald <
> carter.schonwald at gmail.com> wrote:
>
>> whoops, i mean cmpxchgq
>>
>>
>> On Sat, Feb 1, 2014 at 1:36 AM, Carter Schonwald <
>> carter.schonwald at gmail.com> wrote:
>>
>>> ok, i can confirm that on my 64bit mac, both clang and gcc use cmpxchgl
>>> rather than cmpxchg
>>> i'll whip up a strawman patch on head that can be cherrypicked / tested
>>> out by ryan et al
>>>
>>>
>>> On Sat, Feb 1, 2014 at 1:12 AM, Carter Schonwald <
>>> carter.schonwald at gmail.com> wrote:
>>>
>>>> Hey Ryan,
>>>> looking at this closely
>>>> Why isn't CAS using CMPXCHG8B on 64bit architectures?  Could that be
>>>> the culprit?
>>>>
>>>> Could the issue be that we've not had a good stress test that would
>>>> create values that are equal on the 32bit range, but differ on the 64bit
>>>> range, and you're hitting that?
>>>>
>>>> Could you try seeing if doing that change fixes things up?
>>>> (I may be completely wrong, but just throwing this out as a naive
>>>> "obvious" guess)
>>>>
>>>>
>>>> On Sat, Feb 1, 2014 at 12:58 AM, Ryan Newton <rrnewton at gmail.com> wrote:
>>>>
>>>>> Then again... I'm having trouble seeing how the spec on page 3-149 of
>>>>> the Intel manual would allow the behavior I'm seeing:
>>>>>
>>>>>
>>>>> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
>>>>>
>>>>> Nevertheless, this is exactly the behavior we're seeing with the
>>>>> current Haskell primops.  Two threads simultaneously performing the same
>>>>> CAS(p,a,b) can both think that they succeeded.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Feb 1, 2014 at 12:33 AM, Ryan Newton <rrnewton at gmail.com> wrote:
>>>>>
>>>>>> I commented on the commit here:
>>>>>>
>>>>>>
>>>>>> https://github.com/ghc/ghc/commit/521b792553bacbdb0eec138b150ab0626ea6f36b
>>>>>>
>>>>>> The problem is that our "cas" routine in SMP.h is similar to the C
>>>>>> compiler intrinsic __sync_val_compare_and_swap, in that it returns the old
>>>>>> value.  But it seems we cannot use a comparison against that old value to
>>>>>> determine whether or not the CAS succeeded.  (I believe the CAS may fail
>>>>>> due to contention, but the old value may happen to look like our old value.)
>>>>>>
>>>>>> Unfortunately, this didn't occur to me until it started causing bugs
>>>>>> [1] [2].  Fixing casMutVar# fixes these bugs.  However, the way I'm
>>>>>> currently fixing CAS in the "atomic-primops" package is by using
>>>>>> __sync_bool_compare_and_swap:
>>>>>>
>>>>>>
>>>>>> https://github.com/rrnewton/haskell-lockfree/commit/f9716ddd94d5eff7420256de22cbf38c02322d7a#diff-be3304b3ecdd8e1f9ed316cd844d711aR200
>>>>>>
>>>>>> What is the best fix for GHC itself?   Would it be ok for GHC to
>>>>>> include a C compiler intrinsic like __sync_val_compare_and_swap?  Otherwise
>>>>>> we need another big ifdef'd function like "cas" in SMP.h that has the
>>>>>> architecture-specific inline asm across all architectures.  I can write the
>>>>>> x86 one, but I'm not eager to try the others.
>>>>>>
>>>>>> Best,
>>>>>>    -Ryan
>>>>>>
>>>>>> [1] https://github.com/iu-parfunc/lvars/issues/70
>>>>>> [2] https://github.com/rrnewton/haskell-lockfree/issues/15
>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> ghc-devs mailing list
>>>>> ghc-devs at haskell.org
>>>>> http://www.haskell.org/mailman/listinfo/ghc-devs
>>>>>
>>>>>
>>>>
>>>
>>
>
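
For reference, the two compiler intrinsics named in the thread above can be sketched side by side in C. This is only an illustration of the two calling conventions, not the code in SMP.h or in atomic-primops. The wrapper names cas_val, cas_bool and cas_demo are made up for this example, and it assumes GCC or Clang. Whether comparing the returned old value is really unreliable is exactly what the thread is trying to establish, so the sketch takes no position on that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value-returning form: hands back the word that was in *p before the
 * operation. The caller decides success by comparing it to `expected`. */
static inline uintptr_t cas_val(volatile uintptr_t *p,
                                uintptr_t expected, uintptr_t desired)
{
    return __sync_val_compare_and_swap(p, expected, desired);
}

/* Boolean form: the builtin itself reports whether the comparison
 * succeeded and `desired` was written. */
static inline bool cas_bool(volatile uintptr_t *p,
                            uintptr_t expected, uintptr_t desired)
{
    return __sync_bool_compare_and_swap(p, expected, desired);
}

/* Single-threaded walk-through of both forms. */
static void cas_demo(void)
{
    volatile uintptr_t cell = 1;

    uintptr_t old = cas_val(&cell, 1, 2);   /* matches, so the swap happens */
    assert(old == 1 && cell == 2);

    old = cas_val(&cell, 1, 3);             /* stale expectation, no swap */
    assert(old == 2 && cell == 2);

    assert(cas_bool(&cell, 2, 5));          /* success reported directly */
    assert(!cas_bool(&cell, 2, 6));         /* failure reported directly */
    assert(cell == 5);
}
```

With the boolean form the caller never has to infer success from the returned value, which is the property the atomic-primops fix relies on.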