[Haskell-cafe] Re: Haskell Speed
Jan-Willem Maessen
jmaessen at alum.mit.edu
Thu Dec 29 20:23:06 EST 2005
On Dec 29, 2005, at 7:44 PM, Branimir Maksimovic wrote:
> To comment some observation on this program.
> Most of the pressure now is on Data.HashTable.
> I've susspected such large memory usage on substring from array
> conversions,
> so mad version with data MyString = MakeMyStrinf { buf :: Ptr Char,
> size :: Int }
> and there was no substrings in map or anywhere else, but memory
> consumption remains.
> So after eliminating inserts and updates into HashTable memory was ok.
> Finally I've realized that updates into hash table actually increase
> memory usage to large extent instead to keep memory same
> on average. So I guess this is bug in HashTable?
Probably. The minimum table chunk size was rather large. I have
been experimenting (tests are running even as I type) with alternate
implementations of Data.HashTable. So far the winning implementation
is one based on multiplicative hashing and simple table doubling.
This seems to be consistently competitive with / faster than
Data.Map. At this point my inclination is to make the whole suite
available:
Data.HashTable.Class
Data.HashTable.DataMap
Data.HashTable.Doubling
Data.HashTable.Flat
Data.HashTable.Freeze
Data.HashTable.Modulus
Data.HashTable.Multiplicative
Data.HashTable.Original
Data.HashTable.Test
Data.HashTable
I've separated out the hashing technique (Modulus, Multiplicative)
from the hash table implementation. Note that a few of the above are
bogus, and this is only a fraction of the implementations tried thus
far.
I need to get distribution clearance for a bunch of this code from my
employer, and figure out how to package it. The latter may be
tricky, as Data.Hashtable is currently rather intimately tied to a
bunch of rather tricky bits of GHC.
-Jan-Willem Maessen
> Second in this test, hash function needs to be very strong,
> as even with following I got longest chain of 16 elements.
> myHashString = fromIntegral . ff''' . ff'' . ff' . foldr f 0
> where f c m = f'' $ f' (ord c + m)
> f' m = m + (m `shiftL` 10)
> f'' m = m `xor` (m `shiftR` 6)
> ff' m = m + (m `shiftL` 3)
> ff'' m = m `xor` (m `shiftR` 11)
> ff''' m = m + (m `shiftL` 15)
> Default hashString has longestChain of 18 elements.
> Perhaps someone can explain if such a memory leaks from HashTable
> updates
> are normal or are bugs?
> All in all functional version consumes less memory and is twice as
> fast.
>
> Greetings, Bane.
>
> _________________________________________________________________
> Express yourself instantly with MSN Messenger! Download today it's
> FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
More information about the Haskell-Cafe
mailing list