[Haskell-cafe] Re: Hashtable woes

Chris Kuklewicz haskell at list.mightyreason.com
Mon Jan 23 10:09:15 EST 2006


Simon Marlow wrote:
> Bulat Ziganshin wrote:
>> Hello Chris,
>>
>> Monday, January 23, 2006, 12:27:53 PM, you wrote:
>>
>> CK> The only mutable data structure that come with GHC besides arrays is
>> CK> Data.Hashtable, which is not comptetitive with OCaml Hashtbl or DMD's
>> CK> associative arrays (unless there is something great hidden under
>> CK> Data.Graph). Is there any hope for GHC 6.6? Does anyone have
>> pointers to
>> CK> an existing library at all? Perl and Python and Lua also have
>> excellent
>> CK> built in hashtable capabilities. Where is a good library for Haskell?
>>
>> 1) are you used "+RTS -A10m" / "+RTS -H100m"?
>>
>> 2) Simon Marlow optimized something in the IOArray handling, but i
>> don't understand that is changed. see
>> http://cvs.haskell.org/trac/ghc/ticket/650
> 
> Much of the GC overhead of Data.Hash should be gone in GHC 6.6.  I also
> have an improved implementation of Data.Hash from Jan-Willem Maessen to
> import, but that will be 6.6 rather than 6.4.2.
> 
> Cheers,
>     Simon

That is good to hear.  The benchmark's tests take 1,250,000 pre-generated
strings as the keys.  At the end, the string keys are 18 characters long, drawn
randomly from a set of 4 characters.  So the hash computations are a nontrivial hit.

Using -A400m I get 39s down from 55s.  That is the best Data.HashTable time I
have seen. (Using -A10m and -A100m were a little slower).

Using my over optimized c-code hashtable I get 12.3 seconds.  The associative
arrays in OCaml and D are still faster.  So you see why I long for GHC 6.6.

Is Jan-Willem Maessen's Hash available anywhere?  I could benchmark it.


More information about the Haskell-Cafe mailing list