[Haskell-cafe] Re: Haskell Speed
Jan-Willem Maessen
jmaessen at alum.mit.edu
Wed Jan 4 09:47:24 EST 2006
On Jan 4, 2006, at 5:30 AM, Simon Marlow wrote:
> On 30 December 2005 01:23, Jan-Willem Maessen wrote:
>
>> Probably. The minimum table chunk size was rather large. I have
>> been experimenting (tests are running even as I type) with alternate
>> implementations of Data.HashTable. So far the winning implementation
>> is one based on multiplicative hashing and simple table doubling.
>> This seems to be consistently competitive with / faster than
>> Data.Map. At this point my inclination is to make the whole suite
>> available:
>>
>> Data.HashTable.Class
>> Data.HashTable.DataMap
>> Data.HashTable.Doubling
>> Data.HashTable.Flat
>> Data.HashTable.Freeze
>> Data.HashTable.Modulus
>> Data.HashTable.Multiplicative
>> Data.HashTable.Original
>> Data.HashTable.Test
>> Data.HashTable
>>
>> I've separated out the hashing technique (Modulus, Multiplicative)
>> from the hash table implementation. Note that a few of the above are
>> bogus, and this is only a fraction of the implementations tried thus
>> far.
>>
>> I need to get distribution clearance for a bunch of this code from my
>> employer, and figure out how to package it. The latter may be
>> tricky, as Data.Hashtable is currently rather intimately tied to a
>> bunch of rather tricky bits of GHC.
>
> I wonder if you could put together a drop-in replacement for
> Data.HashTable that we can incorporate? There's not much point in us
> still providing the inefficient version as standard after you've done
> all this work to figure out how to do it better.
That'd be good---though what qualifies as the "efficient version"
will, I think, change based on the GC changes you said you made (I
haven't had the time/patience to try to bootstrap the development
version of GHC). This is one of the reasons I've been saving so many
bread crumbs along the way. All of the above will work as drop-in
Data.HashTable replacements, with Data.HashTable simply re-exporting
multiplicative hashing and table doubling.
The tricky bit, as you probably know, is the rather intimate ties
between Data.HashTable and the rest of the builtin code. This
requires the shipping Data.HashTable library to import most things
from the source, rather than from the place where they're made
publicly available. Do you have any suggestions as to how I might go
about testing that integration?
-Jan
>
> Cheers,
> Simon
More information about the Haskell-Cafe
mailing list