Data.HashTable
Jan-Willem Maessen
jmaessen at alum.mit.edu
Thu Mar 6 11:12:46 EST 2008
On Mar 6, 2008, at 9:28 AM, Johannes Waldmann wrote:
> Hello.
>
> When I insert 50 values in an empty hashtable,
> I get longest chains of length 10. Like this:
>
> import Data.HashTable ( HashTable )
> import qualified Data.HashTable as H
> import System.Random
>
> main = do
> t <- H.new ( == ) H.hashInt
> sequence $ replicate 50 $ do
> x <- randomRIO ( 0 , 1000 )
> H.update t x ()
> lc <- H.longestChain t
> print lc
>
> output:
> [(831,()),(346,()),(773,()),(475,()),(812,()),(307,()),(113,()),(637,
> ()),(100,())]
The Data.HashTable code was tuned to favor relatively high bucket
loads. Is that a good idea (I ask, having done some of this tuning
myself)? It's been a long time since the code was re-tuned, and it's
likely some of the decisions ought to change in light of changes in
GHC's compiler, GC, and so forth. At the moment:
--
-----------------------------------------------------------------------------
-- Parameters
tABLE_MAX :: Int32
tABLE_MAX = 32 * 1024 * 1024 -- Maximum size of hash table
tABLE_MIN :: Int32
tABLE_MIN = 8
hLOAD :: Int32
hLOAD = 7 -- Maximum average load of a single
hash bucket
hYSTERESIS :: Int32
hYSTERESIS = 64 -- entries to ignore in load computation
{- Hysteresis favors long association-list-like behavior for small
tables. -}
It's likely modern versions of GHC would fare better with smaller
settings of tABLE_MIN, hLOAD, and hYSTERESIS. I seem to recall that
doubling the table used to be rather hard on the GC.
-Jan
>
>
> This is not what I expected (quite the contrary).
> Did I miss something here? The access time is linear in
> the length of the longest chain?
> Perhaps the initial table (produced by "new") is too small,
> so the table is always overloaded?
>
>
> Looking at
> http://java.sun.com/javase/6/docs/api/java/util/HashMap.html
> they start out with table size 16 by default
> *and* (more importantly) they have a constructor that allows
> to set the initial table size and the load factor.
>
>
> Best regards, Johannes Waldmann.
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries
More information about the Libraries
mailing list