Minor inefficiency in Data.Map and Data.Set?

Tyson Whitehead twhitehead at gmail.com
Mon Dec 21 00:38:25 EST 2009


I was looking at the Data.Map and Data.Set implementations and I think there 
is a slight inefficiency.  Consider the Data.Map code

{--------------------------------------------------------------------
  [glue l r]: glues two trees together.
  Assumes that [l] and [r] are already balanced with respect to each other.
--------------------------------------------------------------------}
glue :: Map k a -> Map k a -> Map k a
glue Tip r = r
glue l Tip = l
glue l r   
  | size l > size r = let ((km,m),l') = deleteFindMax l in balance km m l' r
  | otherwise       = let ((km,m),r') = deleteFindMin r in balance km m l r'

The call to balance should not be needed as subtracting one item from the non-
smaller of them cannot thrown them out of balance given they are already in 
balance with one another.  The only way they could be made out of balance 
would be to subtract an item for smaller one.  The replacement code is

glue l r   
  | sizeL > sizeR = let ((km,m),l') = deleteFindMax l in Bin sizeX km m l' r
  | otherwise     = let ((km,m),r') = deleteFindMin r in Bin sizeX km m l r'
  where
    sizeL = size l
    sizeR = size r
    sizeX = sizeL + sizeR

The Data.Set code is identical except for the key.

Cheers!  -Tyson


More information about the Libraries mailing list