Poor array behaviour

Wed Dec 23 17:07:58 EST 2009

On 23/12/09 07:55, Herk, Robert van wrote:
> Hi all,
>
> I stumbled upon something odd with respect to arrays. I know about GHC not doing card marking and traversing whole arrays one each GC for each array with alterations, but still I don't understand the behaviour.
>
> The situation is like this. I am building a compiler. This compiler manipulates a large graph (for optimizations, etc.). This graph is in memory. As the graph is vast, we did a lot of effort to optimize access to it. The nodes in the graph are IORefs to some data structure, and this data structure contains its edges.
>
> Each node stores its edges in buckets. This is because edges have different types (some are control flow links, others are other types of dependencies). Most of the graph manipulations only traverse over one type of edge, so we figured it would be faster to store the edges in buckets to support such queries. Inside the buckets, there are Data.Maps containing the actual edges for that bucket. The keys in this map are the target nodes of that edges, which are IORefsOrds, which are pairs of a unique integers and a IORef, such that they can be ordered and used as keys in a Map. The values are lists of edges to that target.
>
> The weird thing is in the buckets. Per node, all buckets are stored in an array.  We gave each edge type an integer key.
>
> And we use that key as array index to determine the bucket. I've tried implementing this array in two ways:
>
> 1) Each node contains a immutable array with IORefs. The IORefs contain the actual Data.Maps for the buckets. So, for instance, initializing a node looks something like
>
> import Data.Array
>
> uBucket = 8 //There are 8 buckets because there are 8 types of edges.
>
> emptyEdges =
>    do  buckets<- sequence ( take uBucket $ repeat (newIORef Map.empty) )
>          let myArray = listArray (0, 7) buckets
>          return myArray
>
> So in this solution we have an extra layer of indirection, namely the IORefs, but the array is immutable. Because when the edges change for a particular node, we can write directly into the IORef and leave the array untouched.
>
> 2) Each node contains a mutable array that contains the Data.Maps directly. So, for instance, initializing a node looks something like:
>
> import Data.Array.IO
>
> emptyEdges =
>     do myArray<- newArray (0, 7) Map.empty
>          return $! myArray
>
> Of course, one expect 2 to be quickest. However, it turns out that for 2, the application is spending much more time in GC, and __even without GC__ the application is still slower! I find both things rather weird: I know that for huge arrays I am expected to suffer from the missing card-marking "bug", but my array sizes are only 8.

If you have lots of small mutable arrays, that could be a problem.  We 
implement the write barrier differently for arrays vs. IORefs: an IORef 
is only placed in the remembered set (and hence scanned during minor GC) 
if it is modified.  Mutable arrays on the other hand are always in the 
remembered set, but they are marked either clean or dirty.  There's a 
tradeoff here: the approach we take for arrays means that writes are 
quicker, but each array has to be checked in a minor GC.  It works well 
when arrays are few and large, but less well when they are numerous and 
small.

Incedentally, I recently committed changes to add card marking to 
mutable arrays.  In your case it will probably hurt performance rather 
than improve it, though, since you won't benefit from the card table and 
the write barrier is more expensive.

One solution would be to have a new array type optimised for this case. 
  Alternatively we could experiment with the more expensive write 
barrier and see how much it hurts the large-array cases.

I'd like to verify that this is really the problem - any chance you can 
give us a self-contained snapshot of your code that we can use as a 
benchmark?

Cheers,
	Simon