[Haskell-cafe] Re: Very imperfect hash function

Thu Jan 28 14:37:42 EST 2010

On Thu, 2010-01-28 at 14:07 -0500, Steve Schafer wrote:
> I'm looking for some algorithmic suggestions:
> 
> I have a set of several hundred key/value pairs. The keys are 32-bit
> integers, and are all distinct. The values are also integers, but the
> number of values is small (only six in my current problem). So,
> obviously, several keys map to the same value.
> 
> For some subsets of keys, examining only a small portion of the key's
> bits is enough to determine the associated value. For example, there may
> be 250 keys that all have the same most-significant byte, and all 250
> map to the same value. There are also keys at the other extreme, where
> two keys that differ in only one bit position map to different values.
> 
> The data are currently in a large lookup table. To save space, I'd like
> to convert that into a sort of hash function:
> 
>  hash :: key -> value
> 
> My question is this: Is there any kind of generic approach that can make
> use of the knowledge about the internal redundancy of the keys to come
> up with an efficient function?
> 
> Steve Schafer
> Fenestra Technologies Corp.
> http://www.fenestra.com/

Maybe:

data TTree a = TTree Int (TTree a) (TTree a)
             | TNode a
--              | THashNode <some hash table>

hash :: TTree a -> Int32 -> a
hash (TNode v)     _ = v
hash (TTree b l r) k = if testBit k b then hash r k else hash l k
-- hash (THashNode h) k = lookupHashTable h k

Of course you need to code efficiently the tree.

Regards