[Haskell-beginners] Red-black tree performance

Lorenzo Bolla lbolla at gmail.com
Wed Mar 21 11:02:49 CET 2012


On Wed, Mar 21, 2012 at 9:27 AM, Heinrich Apfelmus <
apfelmus at quantentunnel.de> wrote:

> Adrien Haxaire wrote:
>
>> I tried with foldl'. I modified the code at several places to match
>> the argument pattern, and now I see why flip is so useful :) The
>> conclusion is also interesting: the productivity climbs up to 92%,
>> while the calculation time raises to 6.3s. I guess that the choice is
>> space or time, as often.
>>
>
> 92% productivity seems right for me. In contrast, 20% garbage collection
> may be a sign that something went wrong.

I think that this is likely due to laziness: in the very end, you only
> query the rightmost element. After a while, the program simply won't
> evaluate the balancing on the left side of the tree, as you're not asking
> it to evaluate anything there.
>
> So, you're not necessarily comparing apples and apples here. But on the
> other hand, maybe that's a performance disadvantage of the C++ version. In
> Haskell, performance depends a lot on usage patterns.
>
>
This is very true.
In fact, after some tweaking, I found that the best solution is using
foldl', lazy type and force some strictness in "insert" using "seq". See
below:

import Data.Foldable (foldl', foldr')

data Color = Red | Black deriving (Show)

data Tree a = Empty | Node Color (Tree a) a (Tree a)
              deriving (Show)

insert :: Ord a => a -> Tree a -> Tree a
insert x t = makeBlack (ins t)
             where
               ins Empty = Node Red Empty x Empty
               --  ins (Node color a y b) | x < y  = ins a `seq` balance
color (ins a) y b
               --                         | x == y = Node color a y b
               --                         | x > y  = ins b `seq` balance
color a y (ins b)
               ins (Node color a y b) | x < y  = balance color (ins a) y b
                                      | x == y = Node color a y b
                                      | x > y  = balance color a y (ins b)

makeBlack :: Tree a -> Tree a
makeBlack (Node _ a y b) = Node Black a y b
makeBlack Empty = Empty

balance :: Color -> Tree a -> a -> Tree a -> Tree a
balance Black (Node Red (Node Red a x b) y c) z d = Node Red (Node Black a
x b) y (Node Black c z d)
balance Black (Node Red a x (Node Red b y c)) z d = Node Red (Node Black a
x b) y (Node Black c z d)
balance Black a x (Node Red (Node Red b y c) z d) = Node Red (Node Black a
x b) y (Node Black c z d)
balance Black a x (Node Red b y (Node Red c z d)) = Node Red (Node Black a
x b) y (Node Black c z d)
balance color a x b = Node color a x b

maxTree :: Ord a => Tree a -> a
maxTree (Node _ Empty n Empty) = n
maxTree (Node _ _ _ t) = maxTree t

toInsert :: [Int]
--  toInsert = [1..1000000]
toInsert = map (`mod` 100) [1..10000000]

main :: IO ()
main = putStrLn $ show $ maxTree $ foldl' (flip insert) Empty toInsert


Note that if the improvement is around 10% for "toInsert" being a monotonic
sequence of integers, the improvement is much bigger (>2x for me) for a
more "random" "toInsert" sequence.

L.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/beginners/attachments/20120321/4d3be6c9/attachment.htm>


More information about the Beginners mailing list