[Haskell-cafe] Re: Haskell Speed

Jan-Willem Maessen jmaessen at alum.mit.edu
Thu Dec 29 20:23:06 EST 2005


On Dec 29, 2005, at 7:44 PM, Branimir Maksimovic wrote:
> To comment some observation on this program.
> Most of the pressure now is on Data.HashTable.
> I've susspected such large memory usage on substring from array  
> conversions,
> so mad version with data MyString = MakeMyStrinf { buf :: Ptr Char,  
> size :: Int }
> and there was no substrings in map or anywhere else, but memory
> consumption remains.
> So after eliminating inserts and updates into HashTable memory was ok.
> Finally I've realized that updates into hash table actually increase
> memory usage to large extent instead to keep memory same
> on average. So I guess this is bug in HashTable?

Probably.  The minimum table chunk size was rather large.  I have  
been experimenting (tests are running even as I type) with alternate  
implementations of Data.HashTable.  So far the winning implementation  
is one based on multiplicative hashing and simple table doubling.   
This seems to be consistently competitive with / faster than  
Data.Map.  At this point my inclination is to make the whole suite  
available:

Data.HashTable.Class
Data.HashTable.DataMap
Data.HashTable.Doubling
Data.HashTable.Flat
Data.HashTable.Freeze
Data.HashTable.Modulus
Data.HashTable.Multiplicative
Data.HashTable.Original
Data.HashTable.Test
Data.HashTable

I've separated out the hashing technique (Modulus, Multiplicative)  
from the hash table implementation.  Note that a few of the above are  
bogus, and this is only a fraction of the implementations tried thus  
far.

I need to get distribution clearance for a bunch of this code from my  
employer, and figure out how to package it.  The latter may be  
tricky, as Data.Hashtable is currently rather intimately tied to a  
bunch of rather tricky bits of GHC.

-Jan-Willem Maessen

> Second in this test, hash function needs to be very strong,
> as even with following I got longest chain of 16 elements.
> myHashString = fromIntegral . ff''' . ff'' . ff' . foldr f 0
>      where f c m = f'' $ f' (ord c + m)
>            f' m = m + (m `shiftL` 10)
>            f'' m = m `xor` (m `shiftR` 6)
>            ff' m = m + (m `shiftL` 3)
>            ff'' m = m `xor` (m `shiftR` 11)
>            ff''' m = m + (m `shiftL` 15)
> Default hashString has longestChain of 18 elements.
> Perhaps someone can explain if such a memory leaks from HashTable  
> updates
> are normal or are bugs?
> All in all functional version consumes less memory and is twice as
> fast.
>
> Greetings, Bane.
>
> _________________________________________________________________
> Express yourself instantly with MSN Messenger! Download today it's  
> FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe



More information about the Haskell-Cafe mailing list