DData
Adrian Hey
ahey at iee.org
Thu May 20 10:18:14 EDT 2004
On Wednesday 19 May 2004 11:28 am, Ross Paterson wrote:
> On Mon, May 17, 2004 at 09:39:25AM +0100, Adrian Hey wrote:
> > Here are the results for 2 tests..
> >
> > Test1: Build a tree/finitemap of 1000000 pairs.
> > Test2: As Test1, followed by deletion of all 1000000 pairs.
> >
> > Pairs are inserted/deleted in numerical order [1,2..1000000]
>
> Inserting in random order gives a less dramatic difference.
Yes, I considered this, but decided to do it with sorted data in
order to give whatever balancing mechanism is employed a hammering.
> I changed your testData to
>
> testData = take 100000 $ randomRs (0, 2^30-1) (mkStdGen 7)
>
> (I have a slower machine), changed isEmpty to size to guarantee
> strictness,
Actually, it shouldn't be necessary to do this (for AVL trees
at least). They should behave as being strict in their recursive
fields, and indeed were at one time. But I discovered this actually
slowed things down compared with using explicit `seq` in the code
to force evaluation (slow down was not huge, about 5% or so).
> and got the following times (averaged over 50 runs each):
>
> ins ins+del
> Data.FiniteMap 4.783 8.304
> Data.Tree.AVL 4.561 6.895
> DData.Map 4.765 7.369
> DData.IntMap 4.952 7.742
Unfortunately, I think a large part of the time here is probably accounted
for by the time taken to generate the test data itself, if the results
on my machine are anything to go by. Using your test data increases the
execution time of the insertion test (for 100000 pairs) from 0.39
seconds to 1.88 seconds for AVL trees.
Anyway, FWIW here's the results on my machine (using isEmpty, not size)
For 100000 sorted pairs,
ins ins+del
Data.FiniteMap 0.83 1.18
Data.Tree.AVL 0.39 0.71
For 100000 random pairs,
ins ins+del
Data.FiniteMap 2.04 3.80
Data.Tree.AVL 1.88 3.16
For 1000000 random pairs,
ins ins+del
Data.FiniteMap 32.1 69.4
Data.Tree.AVL 27.0 54.0
Sorry, I don't have DData installed at the moment :-(
Hmm, dunno what to make of these huge times, other than it seems
generating the test data costs more than using it :-). There still
seems to be about 5 seconds difference between the two for insertion
though.
Regards
--
Adrian Hey
More information about the Libraries
mailing list