[Haskell-beginners] Red-black tree performance
Lorenzo Bolla
lbolla at gmail.com
Wed Mar 21 11:08:06 CET 2012
Ops, wrong copy and paste!
See correct script below:
import Data.Foldable (foldl')
data Color = Red | Black deriving (Show)
data Tree a = Empty | Node Color (Tree a) a (Tree a)
deriving (Show)
insert :: Ord a => a -> Tree a -> Tree a
insert x t = makeBlack (ins t)
where
ins Empty = Node Red Empty x Empty
ins (Node color a y b) | x < y = ins a `seq` balance color
(ins a) y b
| x == y = Node color a y b
| x > y = ins b `seq` balance color
a y (ins b)
makeBlack :: Tree a -> Tree a
makeBlack (Node _ a y b) = Node Black a y b
makeBlack Empty = Empty
balance :: Color -> Tree a -> a -> Tree a -> Tree a
balance Black (Node Red (Node Red a x b) y c) z d = Node Red (Node Black a
x b) y (Node Black c z d)
balance Black (Node Red a x (Node Red b y c)) z d = Node Red (Node Black a
x b) y (Node Black c z d)
balance Black a x (Node Red (Node Red b y c) z d) = Node Red (Node Black a
x b) y (Node Black c z d)
balance Black a x (Node Red b y (Node Red c z d)) = Node Red (Node Black a
x b) y (Node Black c z d)
balance color a x b = Node color a x b
maxTree :: Ord a => Tree a -> a
maxTree (Node _ Empty n Empty) = n
maxTree (Node _ _ _ t) = maxTree t
toInsert :: [Int]
-- toInsert = [1..1000000]
toInsert = map (`mod` 100) [1..10000000]
main :: IO ()
main = putStrLn $ show $ maxTree $ foldl' (flip insert) Empty toInsert
Sorry for the noise,
L.
On Wed, Mar 21, 2012 at 10:02 AM, Lorenzo Bolla <lbolla at gmail.com> wrote:
>
>
> On Wed, Mar 21, 2012 at 9:27 AM, Heinrich Apfelmus <
> apfelmus at quantentunnel.de> wrote:
>
>> Adrien Haxaire wrote:
>>
>>> I tried with foldl'. I modified the code at several places to match
>>> the argument pattern, and now I see why flip is so useful :) The
>>> conclusion is also interesting: the productivity climbs up to 92%,
>>> while the calculation time raises to 6.3s. I guess that the choice is
>>> space or time, as often.
>>>
>>
>> 92% productivity seems right for me. In contrast, 20% garbage collection
>> may be a sign that something went wrong.
>
> I think that this is likely due to laziness: in the very end, you only
>> query the rightmost element. After a while, the program simply won't
>> evaluate the balancing on the left side of the tree, as you're not asking
>> it to evaluate anything there.
>>
>>
>> So, you're not necessarily comparing apples and apples here. But on the
>> other hand, maybe that's a performance disadvantage of the C++ version. In
>> Haskell, performance depends a lot on usage patterns.
>>
>>
> This is very true.
> In fact, after some tweaking, I found that the best solution is using
> foldl', lazy type and force some strictness in "insert" using "seq". See
> below:
>
> import Data.Foldable (foldl', foldr')
>
> data Color = Red | Black deriving (Show)
>
> data Tree a = Empty | Node Color (Tree a) a (Tree a)
> deriving (Show)
>
> insert :: Ord a => a -> Tree a -> Tree a
> insert x t = makeBlack (ins t)
> where
> ins Empty = Node Red Empty x Empty
> -- ins (Node color a y b) | x < y = ins a `seq` balance
> color (ins a) y b
> -- | x == y = Node color a y b
> -- | x > y = ins b `seq` balance
> color a y (ins b)
> ins (Node color a y b) | x < y = balance color (ins a) y b
> | x == y = Node color a y b
> | x > y = balance color a y (ins b)
>
> makeBlack :: Tree a -> Tree a
> makeBlack (Node _ a y b) = Node Black a y b
> makeBlack Empty = Empty
>
> balance :: Color -> Tree a -> a -> Tree a -> Tree a
> balance Black (Node Red (Node Red a x b) y c) z d = Node Red (Node Black a
> x b) y (Node Black c z d)
> balance Black (Node Red a x (Node Red b y c)) z d = Node Red (Node Black a
> x b) y (Node Black c z d)
> balance Black a x (Node Red (Node Red b y c) z d) = Node Red (Node Black a
> x b) y (Node Black c z d)
> balance Black a x (Node Red b y (Node Red c z d)) = Node Red (Node Black a
> x b) y (Node Black c z d)
> balance color a x b = Node color a x b
>
> maxTree :: Ord a => Tree a -> a
> maxTree (Node _ Empty n Empty) = n
> maxTree (Node _ _ _ t) = maxTree t
>
> toInsert :: [Int]
> -- toInsert = [1..1000000]
> toInsert = map (`mod` 100) [1..10000000]
>
> main :: IO ()
> main = putStrLn $ show $ maxTree $ foldl' (flip insert) Empty toInsert
>
>
> Note that if the improvement is around 10% for "toInsert" being a
> monotonic sequence of integers, the improvement is much bigger (>2x for me)
> for a more "random" "toInsert" sequence.
>
> L.
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/beginners/attachments/20120321/e3259421/attachment-0001.htm>
More information about the Beginners
mailing list