Use Radix for FiniteMap? (was Re: [Haskell] performance tuning Data.FiniteMap)

S. Alexander Jacobson alex at i2x.com
Tue Feb 24 21:40:16 EST 2004


[Rewrote prior code to be cleaner]
Isn't the following more efficient than Data.FiniteMap?

   class Ix a=>Radix a where maxRange::(a,a)

   class Radix a => HashKey b a where hashKey::b->[a]
   instance Radix Char where maxRange=(chr 0,chr 255)
   instance Radix a=> HashKey [a] a where hashKey x=x

   data HT radix elt = HT (Maybe (Array radix (HT radix elt))) (Maybe elt)
   emptyHT=HT Nothing Nothing
   emptyArray = Just (array maxRange [(x,emptyHT) | x<- [(fst maxRange)..(snd maxRange)]])

   hLookup table key = hLookup' table (hashKey key)
   hLookup' (HT x y) [] = y
   hLookup' (HT Nothing _) _ = Nothing
   hLookup' (HT (Just ar) _) (k:ey) = hLookup' (ar!k) ey

   --insert table key val = insert' table (hashKey key) val
   insert' (HT x _) [] val = HT x val
   insert' (HT Nothing y) key val = insert' (HT emptyArray y) key val
   insert' (HT (Just ar) y) (k:ey) val = HT (Just $ ar//[(k,insert' (ar!k) ey val)]) y

Isn't hLookup substantially faster than the
binarySearch in FiniteMap for e.g. Strings?

Doesn't insert compete with FiniteMap because
small array copies should be blisteringly fast?

Also, basic Haskell questions:
* How do I get insert to typecheck? insert' works fine.
* How do I hide the "lookup" automatically imported from GHC.List?

-Alex-

_________________________________________________________________
S. Alexander Jacobson                  mailto:me at alexjacobson.com
tel:917-770-6565                       http://alexjacobson.com



> On Tue, 24 Feb 2004, JP Bernardy wrote:
>
> >
> > > I believe FiniteMap works by representing the data
> > > in binary trees.  It is therefore O(log2(n)) to
> > > read and update.
> > >
> > > However, if my application reads many more times
> > > than it writes, then perhaps I can get a
> > > substantial performance boost by increasing the
> > > branch factor on the tree.  For example, if each
> > > node was an array of 256 elements, reads would be
> > > O(log256(n)), a 128x improvement!
> >
> > Not quite.
> >
> > In fact, O(log256(n)) is equivalent to O(log2(n)),
> > because there is only a constant factor between the
> > two. That's why basis of logarithms are usually
> > omitted in O() expressions.
> >
> > Besides, the ratio between log256(n) and log2(n) is
> > more like 8 than 128. (And you'd loose this factor
> > in searching the right subtree, as Ketil pointed out)
> >
> > Tuning Data.FiniteMap probably is not what you want.
> >
> > I don't know, but you can have a look at
> > Data.Hashtable.
> >
> > Just my 2 cents,
> > JP.
> >
> >
> >
> > __________________________________
> > Do you Yahoo!?
> > Yahoo! Mail SpamGuard - Read only the mail you want.
> > http://antispam.yahoo.com/tools
> > _______________________________________________
> > Haskell mailing list
> > Haskell at haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell
> >
>
>



More information about the Haskell mailing list