<div dir="ltr"><div>Hi David,</div><div><br></div><div>Interesting.</div><div>I don't have an answer, but I'll write down a few things.</div><div><br></div><div>Your case is:</div><div>  * consecutive FFI calls</div><div>  * on the same Haskell thread</div><div><br></div><div>The consecutive FFI call cases are:</div><div>  (1) do { safe_ffiCall1;   safe_ffiCall2 }</div><div>  (2) do { safe_ffiCall1;   unsafe_ffiCall2 }</div><div>  (3) do { unsafe_ffiCall1; safe_ffiCall2 }</div><div>  (4) do { unsafe_ffiCall1; unsafe_ffiCall2 }</div><div><br></div><div>I think the answer is 'no' at least for case (4).</div><div>There is no memory barrier between unsafe_ffiCall1 and unsafe_ffiCall2.</div><div><br></div><div><br></div><div>And apologies if I'm missing context.</div><div>Although a Haskell thread can migrate to a different OS thread at any point,</div><div>you can put a memory barrier primitive (like the "mfence" instruction [1][2][3])</div><div>at each target point, before or after each FFI call.</div><div><br></div><div>Of course, it's expensive if you put one around every FFI call.</div><div>And you should abstract over the CPU hardware.</div><div>(I found a useful explicit memory barrier API [4].)</div><div><br></div><div><br></div><div>I feel that an _exact_ memory barrier on an out-of-order CPU,</div><div>multi-core, memory-mapped IO, ... 
is very expensive.</div><div>It can only be satisfied by an explicit "hardware memory barrier mechanism".</div><div><br></div><div>And it's difficult to get an exact memory barrier in every case</div><div>from a combination of implicit mechanisms.</div><div><br></div><div><br></div><div>BTW, do you truly need a memory barrier?</div><div>In C too, an exact memory barrier is expensive.</div><div><br></div><div><br></div><div>And maybe the ghc-devs are very busy shipping GHC 7.10.2 :-)</div><div><br></div><div><br></div><div>[1]: Chapter 8.2, <a href="http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf" target="_blank">http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf</a></div><div>[2]: MFENCE, <a href="http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf" target="_blank">http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf</a></div><div>[3]: Chapter 7.5.5, <a href="http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf" target="_blank">http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf</a></div><div>[4]: <a href="https://hackage.haskell.org/package/atomic-primops" target="_blank">https://hackage.haskell.org/package/atomic-primops</a></div><div><br></div><div><br></div><div>Cheers,</div><div>Takenobu</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">2015-06-02 22:26 GMT+09:00 David Turner <span dir="ltr">&lt;<a href="mailto:dct25-561bs@mythic-beasts.com" target="_blank">dct25-561bs@mythic-beasts.com</a>&gt;</span>:<br><blockquote 
class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Takenobu,<br>

<br>

My question is more about consecutive FFI calls on the same Haskell<br>

thread, of which there are I suppose 8 cases in your model: the thread<br>

is {unbound,bound}, the first call is {safe,unsafe} and the second is<br>

{safe,unsafe}. If the thread is bound, there's no problem as the two<br>

calls happen on the same OS thread. No memory barriers are needed. If<br>

the thread is unbound, the two calls may occur on distinct OS threads.<br>

Although the first call must have returned before the second is made,<br>

it doesn't immediately follow that there has been a memory barrier in<br>

between. I'm not sure it matters whether either call is safe or<br>

unsafe. As a Haskell thread can migrate to a different OS thread at<br>

any point, I don't think it's possible to put appropriate memory<br>

barriers in the source.<br>

<br>

I've been looking at the GHC source and commentary and believe the<br>

answer is 'yes', but can anyone from ghc-dev comment on the following?<br>

<br>

If a Haskell thread moves to a different OS thread then<br>

yieldCapability() will at some point be called. This function normally<br>

calls ACQUIRE_LOCK, which is either pthread_mutex_lock() or<br>

EnterCriticalSection() in the threaded runtime (on Linux and Win32<br>

respectively). It looks like both of these count as full memory<br>

barriers. I think in the (rare) case where yieldCapability() only does<br>

a GC and then exits, the fact that it's always called in a loop means<br>

that eventually *some* Task or other emits a memory barrier.<br>

<br>

Thanks in advance,<br>

<br>

David<br>

<div class="HOEnZb"><div class="h5"><br>


On 30 May 2015 at 04:10, Takenobu Tani &lt;<a href="mailto:takenobu.hs@gmail.com">takenobu.hs@gmail.com</a>&gt; wrote:<br>

> Hi David,<br>

><br>

> I'm not 100% sure, especially about the semantics, and I'm still studying too.<br>

> I don't have an answer, but I'll describe the related matters in order to<br>

> organize my thoughts.<br>

><br>

> First:<br>

>   "memory barrier" ... is an ordering-control mechanism between memory accesses.<br>

>   "bound thread"   ... is an association mechanism between FFI calls and a<br>

> specified thread.<br>

><br>

> And:<br>

>   "memory barrier"  ... depends on the CPU hardware architecture (x86, ARM,<br>

> ...).<br>

>   "OS level thread" ... depends on the OS (Linux, Windows, ...).<br>

><br>

> Last:<br>

> There are four cases about ffi call [1]:<br>

>   (1) safe ffi call   on unbound thread(forkIO)<br>

>   (2) unsafe ffi call on unbound thread(forkIO)<br>

>   (3) safe ffi call   on bound thread(main, forkOS)<br>

>   (4) unsafe ffi call on bound thread(main, forkOS)<br>

><br>

> I think maybe (2) and (4) have no memory-ordering guarantee,<br>

> because they might be inlined and optimized.<br>

><br>

> If (1) and (3) always use the pthread API (or a memory barrier API) for the thread/HEC<br>

> context switch,<br>

> they are guaranteed.<br>

> But I think that would not guarantee every case.<br>

><br>

><br>

> I feel that order issues are very difficult.<br>

> I think order issues can be safely solved by explicit notation,<br>

> like explicit memory barrier notation, STM,...<br>

><br>

><br>

> If I have misunderstood, please teach me :-)<br>

><br>

><br>

> [1]:<br>

> <a href="http://takenobu-hs.github.io/downloads/haskell_ghc_illustrated.pdf#page=98" target="_blank">http://takenobu-hs.github.io/downloads/haskell_ghc_illustrated.pdf#page=98</a><br>

><br>

> Cheers,<br>

> Takenobu<br>

><br>

><br>

><br>

> 2015-05-29 1:24 GMT+09:00 David Turner &lt;<a href="mailto:dct25-561bs@mythic-beasts.com">dct25-561bs@mythic-beasts.com</a>&gt;:<br>

>><br>

>> Hi,<br>

>><br>

>> If I make a sequence of FFI calls (on a single Haskell thread) but<br>

>> which end up being called from different OS threads, is there any kind<br>

>> of ordering guarantee given? More specifically, is there a full memory<br>

>> barrier at the point where a Haskell thread migrates to a new OS<br>

>> thread?<br>

>><br>

>> Many thanks,<br>

>><br>

>> David<br>

>> _______________________________________________<br>

>> Haskell-Cafe mailing list<br>

>> <a href="mailto:Haskell-Cafe@haskell.org">Haskell-Cafe@haskell.org</a><br>

>> <a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe</a><br>

><br>

><br>

</div></div></blockquote></div><br></div>
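<div>P.S. To make the four consecutive-call cases from the top of this message concrete, here is a minimal sketch, not a definitive test. It assumes a POSIX system and uses getpid only as a stand-in for an arbitrary C function; the names safeCall, unsafeCall, consecutive, fourCases and report are hypothetical. It shows how the cases are written and how to check whether the calling Haskell thread is bound. It does not prove or disprove that a memory barrier exists between the calls.</div>
<pre>
```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Concurrent
  ( isCurrentThreadBound, rtsSupportsBoundThreads
  , runInBoundThread, runInUnboundThread )
import Control.Monad (when)
import Foreign.C.Types (CInt (..))

-- Assumption: getpid stands in for an arbitrary C function. It was picked
-- only because it lives in libc and takes no arguments (POSIX only).
foreign import ccall safe   "unistd.h getpid" safeCall   :: IO CInt
foreign import ccall unsafe "unistd.h getpid" unsafeCall :: IO CInt

-- Run two FFI calls one after the other on the current Haskell thread
-- and report whether they returned the same value.
consecutive :: IO CInt -> IO CInt -> IO Bool
consecutive first second = do
  a <- first
  b <- second
  return (a == b)

-- Cases (1) to (4), in the same order as the list in the message.
fourCases :: IO [Bool]
fourCases = sequence
  [ consecutive safeCall   safeCall
  , consecutive safeCall   unsafeCall
  , consecutive unsafeCall safeCall
  , consecutive unsafeCall unsafeCall
  ]

-- Print whether the current thread is bound and whether every pair agreed.
report :: String -> IO ()
report label = do
  bound   <- isCurrentThreadBound
  results <- fourCases
  putStrLn (label ++ ": bound=" ++ show bound
                  ++ ", calls agree=" ++ show (and results))

main :: IO ()
main = do
  -- Unbound case: consecutive calls may run on different OS threads.
  runInUnboundThread (report "unbound thread")
  -- Bound case: only available with the threaded RTS (ghc -threaded).
  when rtsSupportsBoundThreads (runInBoundThread (report "bound thread"))
```
</pre>
<div>If an explicit barrier is wanted, it would go between the two calls in consecutive. I believe Data.Atomics in atomic-primops [4] exports barrier functions such as storeLoadBarrier, but that is an extra dependency and is not used in this sketch.</div>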