[Haskell] Incremental Array update (Re: performance tuning Data.FiniteMap)

Mon Mar 1 21:44:47 EST 2004

Hi,

Reading the recent discussions abount Array and FiniteMap, 
I want to know what people think about the current specification of the Array update.

The Array implementaton of GHC effectively has this property (like FiniteMap):
	(arr // cs1) // cs2 == arr // (cs1 ++ cs2) -- allow duplicate indices

Because (//) copies a whole array and slow, instead of the successive updates, I usually accumulate those changes to a list as many as I can in advance, and then give it to (//).
As far as GHC is concerned, all we have to do is to concatenate the list of changes by (++) thanks to the propety.  It generally results in a huge performance inprovement.

However, 
Haskell Report reads (16.2):
> (As with the array function, the indices in the association list must be unique for the updated elements to be defined.) 
So, we must somehow (as oleg's add_uniq[1]) remove the preceding elements having the same index from the list 
to make the program comply Haskell 98 and support Hugs since Hugs does implement that check.  The preconditioning (and also the check performed by the runtime system) causes no small amount of performance hit.

I think it would be better if the above property of incremental updates of Array
is guaranteed sometime hopefully in Haskell 2.

Note:
0) Array give us O(1) read.
1) FiniteMap has the same property.
2) Array is more convenient than monadic arrays, e.g. STArray.
3) Array is faster than ST(U)Array in my experience, if the size of the array is no more than about 100 (when the updates can be combined).
4) The runtime optimization which converts "(arr // cs1) // cs2" to "arr // (cs1 ++ cs2)" might be possible?

[1] http://www.haskell.org//pipermail/haskell/2004-February/013724.html

---
Koji Nakahara