DData

Tue May 25 11:25:01 EDT 2004

On 24 May 2004 12:03, Adrian Hey wrote:

> On Thursday 20 May 2004 5:04 pm, Simon Marlow wrote:
>> And after eliminating some bogosity in the benchmark, I now get:
>> 
>>                Insert   Lookup   Delete
>> AVL            0.87     0.22     0.87
>> FiniteMap      1.07     0.17     1.11
>> DData.Map      1.08     0.16     1.02
>> DData.IntMap   0.65     0.16     0.61
>> 
>> So I declare that on random data, DData.Map has essentially the same
>> performance as FiniteMap.  IntMap is faster for insertion/deletion as
>> expected. AVL is better than FiniteMap for insertion/deletion but
>> slower for lookups.
> 
> Could you explain the bogosity you eliminated?

I did a full GC before running each part of the test to keep random GC
effects to a minimum, and eliminated a couple of cases of badly
optimised code in the test program.

> The reason I ask is that those second results for lookup for AVL
> trees seem pretty reasonable to me, considering the AVL library is
> general purpose 
> and not specialised for any kind of map. Currently the AVL
> implementation will have to read 3 heap records per tree node
> visited, vs. 2 for FiniteMap and DData.Map.
> 
> For the balanced tree implementations, I wouldn't expect to see any
> significant difference in lookup times (given the same degree of
> specialisation) and what differences there may be would most likely
> be due to differences in how well they managed to balance trees. (But
> I guess if random data is to be our benchmark, even simple unbalanced
> trees would probably look quite reasonable :-)

Of course, this isn't testing how well the balancing works.  It is
comparing raw performance of several balanced tree implementations, so
measuring the overhead of balancing, if you like.

> DData.IntMap is rather different from the other three, so I'm not
> sure exactly how well I would expect it to perform, but if you assume
> lookup time is proportional to the No. of heap records examined
> (seems to be reasonable comparing AVL and FiniteMap/DData.Map), then
> you'd expect a lookup time of about 0.08 for each if they were
> specialised for Int maps. 

Yes, IntMap should get faster lookup.  I looked at the code and managed
to make a couple of improvements: adding a 'seq k $' at the beginning of
lookup helps, and I fixed a small performance problem with the Word
library, but interestingly these don't help the results.  I still get
0.16s for the lookup test, so perhaps it is dominated by something else.

Cheers,
	Simon