Proposal: priority queues in containers

Tue Mar 16 09:54:15 EDT 2010

PROPOSAL:  Add a priority queue implementation to the containers package.
 Specific modules will include Data.PQueue.Min, Data.PQueue.Max, and
Data.PQueue.

DEADLINE:  Next Friday, Mar 26.

The ticket, #3909, is here <http://hackage.haskell.org/trac/ghc/ticket/3909>.
 It had done some bouncing around glasgow-haskell-users (mostly because I
forgot how to do a proper libraries proposal submission), and there have
been several implementations benchmarked, compared, and argued about.

DETAILS:

The proposed implementation benchmarked competitively with every alternative
implementation that we tested, and offers good asymptotics in nearly every
operation.  Specifically, we use a binomial heap, which offers

   - O(log n) worst-case union
   - O(log n) worst-case extract (this in particular was a key objective of
   ours)
   - amortized O(1), worst-case O(log n) insertion.  (In a persistent
   context, the amortized performance bound does not necessarily hold.)

This implementation seems to offer the best balance between practical
performance and asymptotic behavior.  Moreover, this implementation is
extremely memory-efficient, resulting in better performance on large data
sets.  (See the ticket for benchmarks.  The most accurate benchmarks are
towards the end.)

A couple potentially contentious design decisions:

   - There is no distinction between keys and priority values.  A utility
   type Prio p a with the instance Ord p => Ord (Prio p a) is exported to allow
   usage of distinct keys and priority values.
   - Min-queues and max queues are separated the following way:
      - Data.PQueue.Min exports the type MinQueue.
      - Data.PQueue.Max exports the type MaxQueue.  (This is implemented as
      a wrapper around MinQueue.)  The method names are the same as
      Data.PQueue.Min, but I think it's acceptable to force qualified
imports when
      using both a max-queue and a min-queue.
      - Data.PQueue adds the alias type PQueue = MinQueue, so that the
      "default" behavior is a min-queue.

These design decisions seem to be sufficient to handle most traditional uses
for priority queues.  In particular, this approach offers a superset of the
functionality offered by Java's built-in priority queue
implementation<http://java.sun.com/javase/6/docs/api/java/util/PriorityQueue.html>,
which makes the same design decisions, but of course, is all imperative and
yucky!  (Also, it offers inferior asymptotics, heh.)

I'm really satisfied with the patch as-is, modulo maybe tinkering with the
code style a little.  I'm also working on an article for TMR on priority
queues in Haskell, some of the different structures we considered, and
particularly the new more-type-safe implementation I came up with for
binomial heaps in the writing of this implementation.

Anyway, discuss, complain, et cetera.

Louis Wasserman
wasserman.louis at gmail.com
http://profiles.google.com/wasserman.louis
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/libraries/attachments/20100316/54142a5e/attachment.html