[Haskell-cafe] Performance of functional priority queues
jon at ffconsultancy.com
Mon Dec 28 15:05:01 EST 2009
On Monday 28 December 2009 18:04:32 Gautam bt wrote:
> On Tue, Dec 29, 2009 at 12:04 AM, Jon Harrop <jon at ffconsultancy.com> wrote:
> > Forcing the evaluating of a thunk replaces the unevaluated expression
> > with the value it evaluates to. That is effectively in-place mutation.
> How can one use that to gain on efficiency?
In a purely functional setting?
> I understand that laziness
> allows a modified data structure to share nodes with the original data
> structure preventing unnecessary copying,
I think you mean that idiomatic purely functional data structures evade
copying the entire structure by breaking it down recursively and referring
back to older versions whenever possible (i.e. sharing).
That has nothing to do with laziness though and, compared to the mutable case,
it incurs a lot of extra copying. In fact, that is half the reason why purely
functional data structures are so slow. The other half is that recursive
decomposition leads to many levels of indirection which is hugely inefficient
on modern computers due to cache misses.
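To make the sharing concrete, here is a minimal sketch (my code, not from the thread) of persistent insertion into a binary search tree: only the nodes on the search path are copied, and every subtree off that path is shared with the old version. The chain of `Node` constructors is also exactly the kind of pointer indirection the cache-miss complaint refers to.

```haskell
-- Persistent insertion into an unbalanced binary search tree
-- (illustrative sketch, not code from the thread). Only the nodes on
-- the search path are copied; every subtree off that path is shared
-- with the previous version of the tree.
data Tree a = Leaf | Node (Tree a) a (Tree a)

insert :: Ord a => a -> Tree a -> Tree a
insert x Leaf = Node Leaf x Leaf
insert x t@(Node l v r)
  | x < v     = Node (insert x l) v r   -- copy this node, share r
  | x > v     = Node l v (insert x r)   -- copy this node, share l
  | otherwise = t                       -- already present: share everything
```

Inserting into a tree of depth d copies d nodes and shares the rest, versus mutating one node in place.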
The main use case where purely functional data structures pay dividends in
terms of performance is persistence. For example, when you backtrack in an
n-queens solver you create a new, slightly-different list and forget about
the old one (which gets recycled). This style of programming is very common
in metaprograms such as compilers, interpreters and theorem provers.
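For illustration, a tiny n-queens enumerator in this style (my sketch, not code from the post): each candidate placement conses one queen onto the shared partial solution, and backtracking simply abandons the extended list, whose single fresh cons cell is then recycled by the garbage collector.

```haskell
-- A tiny n-queens enumerator (illustrative sketch, not from the post).
-- Each solution-in-progress is an ordinary list; extending it with
-- (q : qs) allocates one cons cell and shares the whole tail qs.
-- Backtracking just drops the extended list; the abandoned cell is
-- garbage-collected ("recycled").
safe :: Int -> [Int] -> Bool
safe q qs = and [ q /= c && abs (q - c) /= d | (d, c) <- zip [1 ..] qs ]

queens :: Int -> [[Int]]
queens n = go n
  where
    go 0 = [[]]
    go k = [ q : qs | qs <- go (k - 1), q <- [1 .. n], safe q qs ]
```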
> but I do not see how forcing an
> evaluation can be used to gain on efficiency (or alternatively prevent
> an inefficiency).
An example of preventing an inefficiency in the purely functional case is
easy: consider two threads about to perform the same computation. Making them
compete to force a single shared thunk can prevent them from redundantly
performing the same computation twice.
The downside is that the implementation of laziness must deal with race
conditions and this is neither easy nor efficient.
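The sharing itself is easy to observe with `Debug.Trace` (a contrived example of mine, single-threaded for simplicity; GHC's threaded runtime preserves the same at-most-once property across threads by "blackholing" a thunk while it is under evaluation):

```haskell
import Debug.Trace (trace)

-- Contrived demonstration (my example, not from the post) that a
-- shared thunk is forced at most once: the message "forcing" is
-- printed a single time even though two consumers demand the value.
shared :: Integer
shared = trace "forcing" (sum [1 .. 1000000])

twice :: Integer
twice = shared + shared   -- the thunk is evaluated once, the result reused
```

Evaluating `twice` in GHCi prints "forcing" once, not twice.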
> Is there any simple example to illustrate this (or should I
> read Okasaki)?
You should read Okasaki regardless. :-)
A good example from Okasaki might be the purely functional queue. A simple
eager solution maintains front and back lists, pushes on the back and pops
from the front except when it is empty, whereupon it reverses the back list
to create a new front list. Eager evaluation of that list reversal is
problematic in the presence of persistent use: the programmer might keep the
same old version of a queue with an empty front list around and pop from it
several times, whereupon the same list reversal is needlessly repeated
every time. So non-persistent use of these "batched queues" has good
amortized complexity but persistent use has awful complexity.
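The batched queue just described looks roughly like this (my transcription, names mine, not Okasaki's exact code):

```haskell
-- Eager "batched queue": push onto the back list, pop from the front,
-- and reverse the back list only when the front runs empty. Amortized
-- O(1) under single-threaded (ephemeral) use, but a persistently
-- reused empty-front version pays for the same reversal on every pop.
data Queue a = Queue [a] [a]   -- front, back

empty :: Queue a
empty = Queue [] []

push :: a -> Queue a -> Queue a
push x (Queue f b) = Queue f (x : b)

pop :: Queue a -> Maybe (a, Queue a)
pop (Queue [] [])     = Nothing
pop (Queue [] b)      = pop (Queue (reverse b) [])   -- the O(n) reversal
pop (Queue (x : f) b) = Just (x, Queue f b)
```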
Okasaki presents a "real-time" queue that uses laziness to avoid this
inefficiency. In essence, the reversal is stored element-wise as thunks that
are forced only when necessary and their result reused if it is ever required
again. So it makes persistent use asymptotically more efficient.
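For completeness, here is a condensed sketch along the lines of Okasaki's real-time queue (my transcription; names and details may differ from the book). The front list is built lazily by `rotate`, and a "schedule" list forces one of its thunks per operation, so no single pop ever triggers a full reversal:

```haskell
-- Real-time queue in the style of Okasaki (my transcription). The
-- invariant is length schedule == length front - length back; when the
-- schedule runs out, the back list is one longer than the front and is
-- rotated into a lazily built new front. Forcing one schedule cell per
-- operation spreads the reversal's cost, giving worst-case O(1)
-- operations even under persistent use.
data Queue a = Queue [a] [a] [a]   -- front (lazy), back, schedule

empty :: Queue a
empty = Queue [] [] []

-- Precondition: length back == length front + 1.
rotate :: [a] -> [a] -> [a] -> [a]
rotate []       (y : _)  acc = y : acc
rotate (x : xs) (y : ys) acc = x : rotate xs ys (y : acc)
rotate _        []       acc = acc   -- unreachable under the precondition

exec :: [a] -> [a] -> [a] -> Queue a
exec f b (_ : s) = Queue f b s              -- force one thunk of the front
exec f b []      = let f' = rotate f b []   -- schedule exhausted: rotate
                   in Queue f' [] f'

push :: a -> Queue a -> Queue a
push x (Queue f b s) = exec f (x : b) s

pop :: Queue a -> Maybe (a, Queue a)
pop (Queue []      _ _) = Nothing
pop (Queue (x : f) b s) = Just (x, exec f b s)
```

Because the structure is immutable, an old version of the queue can be popped repeatedly and each reversal thunk is still evaluated at most once.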
Dr Jon Harrop, Flying Frog Consultancy Ltd.