Add Ord Laws to next Haskell Report

Carter Schonwald carter.schonwald at gmail.com
Thu Feb 7 21:27:53 UTC 2019


@sven and @henning:
I'm actually doing some preliminary work to add save and restore of FPU
state to the GHC RTS, at the green/Haskell thread layer. That comes after
first ripping out the x87 code gen, which just needs some more docs written
before it's merged. Note that I'm speaking specifically of saving and
restoring the MXCSR register, not the heftier operations you might be
thinking of.
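
To make concrete what an MXCSR save/restore covers, here is a minimal C
sketch, assuming an x86 target with SSE and the <xmmintrin.h> intrinsics
header. The helper names are hypothetical and are not GHC RTS functions; they
only decode the exception-mask and rounding-mode fields of a saved MXCSR
value.

```c
#include <assert.h>
#include <stdbool.h>
#include <xmmintrin.h>  /* _MM_MASK_INVALID, _MM_EXCEPT_INVALID, _MM_ROUND_MASK */

/* Hypothetical helper: true if an invalid operation (for example arithmetic
 * on a signalling NaN) would trap, i.e. its mask bit is clear in csr. */
bool traps_on_invalid(unsigned int csr)
{
    return (csr & _MM_MASK_INVALID) == 0;
}

/* Hypothetical helper: the rounding-control field of csr, comparable with
 * _MM_ROUND_NEAREST, _MM_ROUND_DOWN, _MM_ROUND_UP, _MM_ROUND_TOWARD_ZERO. */
unsigned int rounding_mode(unsigned int csr)
{
    return csr & _MM_ROUND_MASK;
}

/* Hypothetical helper: csr with the invalid-operation exception unmasked
 * and its sticky flag cleared, so that loading the result does not trap
 * on a stale flag. */
unsigned int enable_invalid_trap(unsigned int csr)
{
    unsigned int result = csr & ~(unsigned int)_MM_MASK_INVALID
                              & ~(unsigned int)_MM_EXCEPT_INVALID;
    assert(traps_on_invalid(result));  /* postcondition sanity check */
    return result;
}
```

A value produced by enable_invalid_trap(_mm_getcsr()) could then be loaded
with _mm_setcsr; the sketch deliberately stops short of changing any live
FPU state.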

FPU mode state save and restore is already done by EVERY OS when switching
threads/processes, and per the Agner Fog latency tables the cost of
manipulating the MXCSR register is pretty small:
https://www.agner.org/optimize/instruction_tables.pdf

LDMXCSR (restore) and STMXCSR (save) have CPU latencies of roughly 5-20
cycles (more often 8-15). So it is very doable to have the current C FFI
calls set the default C FPU environment (which is what they ordinarily see
today), ensuring existing C bindings don't break, and to add a new ccall
variant that inherits the host Haskell thread's FPU state. We're talking
sub-10-nanosecond overhead on x86 and x86_64 platforms (and either way, GHC
will soon only use SSE2 or higher on those platforms).
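
The two call paths described above can be sketched in C, assuming x86 with
SSE and the _mm_getcsr/_mm_setcsr intrinsics (which compile to STMXCSR and
LDMXCSR). The function names and the foreign_fn type are hypothetical, and
this illustrates the idea rather than what the GHC RTS does or will do.

```c
#include <assert.h>
#include <xmmintrin.h>  /* _mm_getcsr (STMXCSR), _mm_setcsr (LDMXCSR) */

/* MXCSR reset value: all six exceptions masked, round to nearest. */
#define DEFAULT_MXCSR 0x1F80u

/* Hypothetical stand-in for a foreign C function reached through the FFI. */
typedef double (*foreign_fn)(double);

/* Sketch of the existing-ccall behaviour: the callee always sees the
 * default C FPU environment, whatever the calling Haskell thread had set. */
double call_with_default_fpu(foreign_fn fn, double arg)
{
    assert(fn != 0);
    unsigned int saved = _mm_getcsr();  /* save the caller's MXCSR */
    _mm_setcsr(DEFAULT_MXCSR);          /* present the default environment */
    double result = fn(arg);
    _mm_setcsr(saved);                  /* restore; also drops any sticky
                                           flags the callee raised */
    return result;
}

/* Sketch of the proposed variant: the callee inherits the caller's MXCSR,
 * so no save or restore is needed around the call. */
double call_inheriting_fpu(foreign_fn fn, double arg)
{
    assert(fn != 0);
    return fn(arg);
}

/* Demo callee: reports the rounding mode it observes, ignoring its input. */
double report_rounding_mode(double unused)
{
    (void)unused;
    return (double)_MM_GET_ROUNDING_MODE();
}
```

If the caller first selects round-toward-zero with _MM_SET_ROUNDING_MODE,
report_rounding_mode observes round-to-nearest through the first wrapper and
round-toward-zero through the second, and the caller's own mode is unchanged
after either call.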

Point being: aside from the AMD Piledriver microarchitecture and some
stuff from VIA, the CPU instructions for setting up the signalling-NaN
state, the rounding mode, and related settings should perform perfectly well.

@Daniel Cartwright <chessai1996 at gmail.com>: I do not support documenting
false laws in any enshrined way; it will result in broken code. (Also, I'm
actually working on some fixes, if you reread my remarks and merijn's,
and I think we can have our cake and eat it too, with the finest floats.)
Let's fix stuff and then document true laws!



On Thu, Feb 7, 2019 at 12:05 PM Sven Panne <svenpanne at gmail.com> wrote:

> On Thu, Feb 7, 2019 at 17:22, Henning Thielemann <
> lemming at henning-thielemann.de> wrote:
>
>> [...] What about calling into foreign code? If I call a BLAS routine and
>> one element of the result vector is NaN, shall this be trapped? Or shall
>> it be trapped once I access the NaN element?
>>
>
> IMHO this is the biggest show stopper for some exotic NaN handling, as
> correct as it may be mathematically or aesthetically: The floating point
> environment is a thread-local (i.e. basically global) entity on most
> platforms, and most programming language runtimes expect a "default"
> environment, i.e. no traps when NaNs are encountered. So if Haskell wants
> to do things differently, the FPE has to be set/reset around foreign calls
> and around every Haskell callback. I am not sure if this is really
> worth the trouble and the performance loss. For some special applications
> it might be OK or even important, but my gut feeling is that trapping NaNs
> is the wrong default in our current world...