[Haskell-cafe] String Hashing

Thomas Conway drtomc at gmail.com
Sun Jun 17 21:55:05 EDT 2007


Hi All,

I'm trying to figure out how to maximum performance out of one of my
inner loops which involves string hashing.

Consider the following hash function, which is a transliteration of a
good one written in C:

--8x--8x--8x--8x--8x--8x--8x--8x--8x
module HashStr where

import Data.Bits
import Data.ByteString as BLOB
import Data.Word

data Triple = Triple !Word64 !Word64 !Word64

hashStr :: ByteString -> Word64
hashStr str = hashBlock (Triple gold gold gold) str
    where
    gold = 0x9e3779b97f4a7c13

hashBlock (Triple a b c) str
    | BLOB.length str > 0 = hashBlock (mix (Triple a' b' c')) rest
    | otherwise           = c
    where
    a' = a + BLOB.foldl' make 0 (slice 0)
    b' = b + BLOB.foldl' make 0 (slice 1)
    c' = c + BLOB.foldl' make 0 (slice 2)
    make x w = (x `shiftL` 8) + fromIntegral w
    slice n = BLOB.take 8 $ BLOB.drop (8 * n) str
    rest = BLOB.drop 24 str

    mix :: Triple -> Triple
    mix = (\(Triple a b c) -> Triple (a - c) b c) .
          (\(Triple a b c) -> Triple (a `xor` (c `shiftR` 43)) b c) .
          (\(Triple a b c) -> Triple a (b - c) c) .
          (\(Triple a b c) -> Triple a (b - a) c) .
          (\(Triple a b c) -> Triple a (b `xor` (a `shiftL` 9)) c) .
          (\(Triple a b c) -> Triple a b (c - a)) .
          (\(Triple a b c) -> Triple a b (c - b)) .
          (\(Triple a b c) -> Triple a b (c `xor` (b `shiftR` 8))) .
          (\(Triple a b c) -> Triple (a - b) b c) .
          (\(Triple a b c) -> Triple (a - c) b c) .
          (\(Triple a b c) -> Triple (a `xor` (c `shiftR` 38)) b c) .
          (\(Triple a b c) -> Triple a (b - c) c) .
          (\(Triple a b c) -> Triple a (b - a) c) .
          (\(Triple a b c) -> Triple a (b `xor` (a `shiftL` 23)) c) .
          (\(Triple a b c) -> Triple a b (c - a)) .
          (\(Triple a b c) -> Triple a b (c - b)) .
          (\(Triple a b c) -> Triple a b (c `xor` (b `shiftR` 5))) .
          (\(Triple a b c) -> Triple (a - b) b c) .
          (\(Triple a b c) -> Triple (a - c) b c) .
          (\(Triple a b c) -> Triple (a `xor` (c `shiftR` 35)) b c) .
          (\(Triple a b c) -> Triple a (b - c) c) .
          (\(Triple a b c) -> Triple a (b - a) c) .
          (\(Triple a b c) -> Triple a (b `xor` (a `shiftL` 49)) c) .
          (\(Triple a b c) -> Triple a b (c - a)) .
          (\(Triple a b c) -> Triple a b (c - b)) .
          (\(Triple a b c) -> Triple a b (c `xor` (b `shiftR` 11))) .
          (\(Triple a b c) -> Triple (a - b) b c) .
          (\(Triple a b c) -> Triple (a - c) b c) .
          (\(Triple a b c) -> Triple (a `xor` (c `shiftR` 12)) b c) .
          (\(Triple a b c) -> Triple a (b - c) c) .
          (\(Triple a b c) -> Triple a (b - a) c) .
          (\(Triple a b c) -> Triple a (b `xor` (a `shiftL` 18)) c) .
          (\(Triple a b c) -> Triple a b (c - a)) .
          (\(Triple a b c) -> Triple a b (c - b)) .
          (\(Triple a b c) -> Triple a b (c `xor` (b `shiftR` 22)))

--8x--8x--8x--8x--8x--8x--8x--8x--8x

Obviously, we'd like all those lambdas and composes to be inlined,
along with all the intermediate Triple structures. So, how do you
convince ghc to do this? Alternatively, how would you *translate*
rather than transliterate, the mix function?

-- 
Dr Thomas Conway
drtomc at gmail.com

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.


More information about the Haskell-Cafe mailing list