> I've attached a revised implementation. With my benchmark it gives a
> stack overflow:

It may be in replicate. Check with:

let incRef = atomicModifyIORef r (\a -> (a,a))
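To spell out what that check does, here is a rough sketch. It assumes incRef originally bumps an Int counter held in an IORef; the names incLazy, incNoop and incStrict are made up for illustration and are not from the attached code. If the overflow goes away with the no-op version, the counter update is the likely culprit and a strict update is one possible fix. If it stays, look at replicate.

```haskell
import Control.Monad (replicateM_)
import Data.IORef (IORef, atomicModifyIORef, atomicModifyIORef', newIORef, readIORef)

-- Assumed shape of the original update: the new value (a + 1) is stored
-- unevaluated, so a chain of thunks can pile up if nothing forces it.
incLazy :: IORef Int -> IO Int
incLazy r = atomicModifyIORef r (\a -> (a + 1, a))

-- The check from the message: the stored value is left unchanged,
-- so this update cannot build up any thunks.
incNoop :: IORef Int -> IO Int
incNoop r = atomicModifyIORef r (\a -> (a, a))

-- Possible fix if the counter is at fault: atomicModifyIORef' forces
-- the new value before storing it.
incStrict :: IORef Int -> IO Int
incStrict r = atomicModifyIORef' r (\a -> (a + 1, a))

main :: IO ()
main = do
  r <- newIORef 0
  replicateM_ 100000 (incStrict r)
  readIORef r >>= print
```

atomicModifyIORef' needs a reasonably recent base; on older versions you would have to force the result of the lazy version yourself.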

> As a side note, it's necessary to add parallelStop, to kill all the
> threads - or you get thread blocked exceptions being raised.

Alternatively, you can catch this exception in addWorker.
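A minimal sketch of that idea, under some assumptions: I don't know how your addWorker is written, so the version below is a hypothetical stand-in that takes jobs from an MVar in a loop. The exception is called BlockedIndefinitelyOnMVar in current Control.Exception; older GHC versions used a different name.

```haskell
import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (BlockedIndefinitelyOnMVar (..), handle)
import Control.Monad (forever)

-- Hypothetical stand-in for addWorker: a thread that runs jobs from a
-- shared MVar forever. When no one can ever fill the MVar again, the
-- runtime throws BlockedIndefinitelyOnMVar, and the handler turns that
-- into a quiet exit of the worker.
addWorker :: MVar (IO ()) -> IO ThreadId
addWorker queue = forkIO (handle blocked (forever step))
  where
    step = do
      job <- takeMVar queue
      job
    blocked :: BlockedIndefinitelyOnMVar -> IO ()
    blocked BlockedIndefinitelyOnMVar = return ()

main :: IO ()
main = do
  queue <- newEmptyMVar
  done <- newEmptyMVar
  _ <- addWorker queue
  putMVar queue (putMVar done ())  -- hand one job to the worker
  takeMVar done                    -- wait until the job has run
  putStrLn "job ran"
```

Compared with parallelStop this leaves shutdown to the garbage collector, so the workers only go away once the queue is unreachable.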

