[Haskell-cafe] Pure hashtable library
jmaessen at alum.mit.edu
Thu Aug 28 09:44:11 EDT 2008
On Aug 27, 2008, at 4:31 PM, Bulat Ziganshin wrote:
> Hello Jan-Willem,
> Wednesday, August 27, 2008, 4:06:11 PM, you wrote:
>> One obvious way to make non-modifiable hash tables useful is to "eat
>> your own tail" non-strictly--- aggregate a set of hash table entries,
>> construct a hash table from them, and plumb the resulting hash table
>> into the original computation by tying the knot. This works really
>> well if you can construct the bucket lists lazily and if you specify
>> the table size up front.
> i think it's impossible since you need to scan whole assoclist to
> build list of entries for any value of hash function.
I think this is because I wasn't quite clear enough about the problem
I was solving. I think you'll agree that we can use an assocList non-
strictly in the following sense:
* We can do lookups that succeed so long as we can compute all keys
up to and including the key that matches.
* We never do non-strict lookups that fail, as that would require
examining the entire assocList.
Now I can build a hashtable with the same property from an assocList,
albeit very inefficiently, using code like this (untested):
lazyArray :: (Ix i) => (i,i) -> [(i,v)] -> Array i [v]
lazyArray bnds kvs =
array bnds [ (k, map snd . filter ((k==) . fst) $ kvs) | k <-
range bnds ]
makeHash :: (Eq k, Hashable k) => [(k,v)] -> Array Int [(k,v)]
makeHash assocList =
lazyArray (0,n-1) labeledAssocList
where labeledAssocList = [ (hashToSize n k, (k,v)) | (k,v) <-
We label each assocList element with its corresponding hash bucket
(labeledAssocList); each bucket then contains exactly the elements of
assocList that map to that bucket, in exactly the order in which they
occurred in assocList.
The LazyArray library in hbc essentially did exactly what the
lazyArray function I've shown above does, only the input list is
traversed once rather than having a separate traversal for each
bucket. This can actually be implemented using the ST monad augmented
by unsafeFreezeSTArray and unsafeInterleaveST; indeed the "State in
Haskell" paper by Peyton Jones and Launchbury gives the implementation
of a very similar function.
I have code for LazyArray based on the above paper that works with
GHC, but I haven't needed it in a while.
More information about the Haskell-Cafe