FFI-free NaN checks? (isDoubleNan and friends)

Tue Mar 6 15:30:06 UTC 2018

Hi,

You can try with foreign primops, it should be faster than the FFI:

In IsDoubleNanPrim.s:

.global isDoubleNan_prim
isDoubleNan_prim:
    xor %rbx,%rbx
    ucomisd %xmm1, %xmm1
    lahf
    testb $68, %ah
    jnp .Lout
    mov $1, %rbx
.Lout:
    jmp * (%rbp)

In IsDoubleNan.hs:

{-# LANGUAGE GHCForeignImportPrim #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnliftedFFITypes #-}

module Main where

import GHC.Base

foreign import prim "isDoubleNan_prim" isDoubleNan_prim :: Double# -> Int#

isDoubleNan :: Double -> Bool
isDoubleNan (D# d#) = case isDoubleNan_prim d# of
    0# -> False
    _  -> True

main :: IO ()
main = do
    let testNaN x = putStrLn $ "Testing " ++ show x ++ ": " ++ show 
(isDoubleNan x)
    testNaN 10.3
    testNaN (0/0)

Compile with: ghc -Wall -O IsDoubleNan.hs IsDoubleNanPrim.s

I haven't benchmarked this but I would be interested to see the 
comparison with the other versions on your benchmarks!

Cheers,
Sylvain

On 05/03/2018 22:53, Mateusz Kowalczyk wrote:
> Hi,
>
> Recently at a client I was profiling some code and isDoubleNaN lit up.
> We were checking a lot of doubles for NaN as that's what customer would
> send in.
>
> I went to investigate and I found that FFI is used to achieve this. I
> was always under the impression that FFI costs a little. I had at the
> time replaced the code with a hack with great results:
>
> ```
> isNaN' :: Double -> Bool
> isNaN' d = d /= d
> ```
>
> While this worked and provided good speedup in my case, this fails
> catastrophically if the program is compiled with -ffast-math. This is
> expected. I have since reverted it. Seeking an alternative solution I
> have thought about re-implementing the C code with a native Haskell
> version: after all it just checks a few bits. Apparently unsafeCoerce#
> and friends were a big no-no but I found
> https://phabricator.haskell.org/D3358 . I have implemented the code at
> the bottom of this post. Obviously it's missing endianness (compile-time
> switch).
>
> This seems to be faster for smaller `mkInput` list than Prelude.isNaN
> but slower slightly on the one below. The `/=` version is the fastest
> but very fragile.
>
> My question to you all is whether implementing a version of this
> function in Haskell makes sense and if not, why not? The
> stgDoubleToWord64 is implemented in CMM and I don't know anything about
> the costs of that.
>
> * Is there a cheaper alternative to FFI way?
> * If yes, does anyone know how to write it such that it compiles to same
> code but without the call overhead? I must have failed below as it's
> slower on some inputs.
>
> Basically if a faster way exists for isNaN, something I have to do a
> lot, I'd love to hear about it.
>
> I leave you with basic code I managed to come up with. 8.4.x only.
>
>
> ```
> {-# LANGUAGE MagicHash    #-}
> {-# OPTIONS_GHC -O2 -ddump-simpl -ddump-stg -ddump-to-file -ddump-asm #-}
> module Main (main) where
>
> import GHC.Float
> import GHC.Prim
>
> isNaN' :: Double -> Bool
> isNaN' d = d /= d
>
> isNaNBits :: Double -> Bool
> isNaNBits (D# d) = case (bits `and#` expMask) `eqWord#` expMask of
>    1# -> case bits `and#` mantissaMask of
>      0## -> False
>      _ -> True
>    _ -> False
>    where
>      bits :: Word#
>      bits = stgDoubleToWord64 d
>
>      expMask, mantissaMask :: Word#
>      expMask = 0x7FF0000000000000##
>      mantissaMask = 0x000FFFFFFFFFFFFF##
>
> main :: IO ()
> main = sumFilter isNaN {-isNaN'-} {-isNaNBits-} (mkInput 100000000)
> `seq` pure ()
>    where
>      nan :: Double
>      nan = log (-1)
>
>      mkInput :: Int -> [Double]
>      mkInput n = take n $ cycle [1, nan]
>
>      sumFilter :: (Double -> Bool) -> [Double] -> Double
>      sumFilter p = Prelude.sum . Prelude.filter (not . p)
> ```
>