Feedback request: priority queues in containers

Tue Mar 16 09:29:06 EDT 2010

Hey,

I'd like to request some more feedback on the proposed implementation
<http://hackage.haskell.org/trac/ghc/ticket/3909> for priority queues in
containers.  Mostly, I feel like
adding a new module to containers should be contentious, and there hasn't
been as much griping or contention as I expected.  The silence is feeling
kind of eerie!

I'm inclined to set a deadline of next Wednesday, Mar 24, because the ticket
was started two weeks ago and the current implementation has been
essentially unchanged for a week.  After that point, I'll consider the patch
final.

The proposed implementation benchmarked competitively with every alternative
implementation that we tested, and offers good asymptotics in nearly every
operation.  Specifically, we use a binomial heap, which offers

   - O(log n) worst-case union
   - O(log n) worst-case extract (this in particular was a key objective of
   ours)
   - amortized O(1), worst-case O(log n) insertion.  (In a persistent
   context, the amortized performance bound does not necessarily hold.)
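
For readers who want the operations above spelled out, here is a minimal
textbook-style binomial heap.  It is a sketch only, not the code in the
ticket: all names (Tree, Heap, insTree, mergeTrees, minView, and so on) are
chosen for illustration, and the patch itself uses a different, more
strongly typed representation.

```haskell
-- A binomial tree: rank, root element, children in decreasing rank order.
data Tree a = Node Int a [Tree a]

-- A heap is a list of trees in strictly increasing rank order.
newtype Heap a = Heap [Tree a]

empty :: Heap a
empty = Heap []

rank :: Tree a -> Int
rank (Node r _ _) = r

root :: Tree a -> a
root (Node _ x _) = x

-- Link two trees of equal rank, keeping the smaller root on top.
link :: Ord a => Tree a -> Tree a -> Tree a
link t1@(Node r x1 c1) t2@(Node _ x2 c2)
  | x1 <= x2  = Node (r + 1) x1 (t2 : c1)
  | otherwise = Node (r + 1) x2 (t1 : c2)

-- Insert a tree into a rank-ordered list, carrying like binary increment.
insTree :: Ord a => Tree a -> [Tree a] -> [Tree a]
insTree t [] = [t]
insTree t ts@(t' : ts')
  | rank t < rank t' = t : ts
  | otherwise        = insTree (link t t') ts'

-- Insertion: O(log n) worst case, O(1) amortized when used ephemerally.
insert :: Ord a => a -> Heap a -> Heap a
insert x (Heap ts) = Heap (insTree (Node 0 x []) ts)

-- Merge two rank-ordered tree lists, like binary addition with carries.
mergeTrees :: Ord a => [Tree a] -> [Tree a] -> [Tree a]
mergeTrees ts [] = ts
mergeTrees [] ts = ts
mergeTrees ts1@(t1 : ts1') ts2@(t2 : ts2')
  | rank t1 < rank t2 = t1 : mergeTrees ts1' ts2
  | rank t2 < rank t1 = t2 : mergeTrees ts1 ts2'
  | otherwise         = insTree (link t1 t2) (mergeTrees ts1' ts2')

-- Union: O(log n) worst case.
union :: Ord a => Heap a -> Heap a -> Heap a
union (Heap a) (Heap b) = Heap (mergeTrees a b)

-- Pull out the tree with the smallest root, if there is one.
removeMinTree :: Ord a => [Tree a] -> Maybe (Tree a, [Tree a])
removeMinTree [] = Nothing
removeMinTree (t : ts) = case removeMinTree ts of
  Nothing -> Just (t, [])
  Just (t', ts')
    | root t <= root t' -> Just (t, ts)
    | otherwise         -> Just (t', t : ts')

-- Extract the minimum: O(log n) worst case.
minView :: Ord a => Heap a -> Maybe (a, Heap a)
minView (Heap ts) = case removeMinTree ts of
  Nothing                  -> Nothing
  Just (Node _ x cs, rest) -> Just (x, Heap (mergeTrees (reverse cs) rest))

fromList :: Ord a => [a] -> Heap a
fromList = foldr insert empty

toAscList :: Ord a => Heap a -> [a]
toAscList h = case minView h of
  Nothing      -> []
  Just (x, h') -> x : toAscList h'
```

The children of an extracted root are stored in decreasing rank order, so
they are reversed before being merged back into the remaining trees.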

This implementation seems to offer the best balance between practical
performance and asymptotic behavior.  Moreover, this implementation is
extremely memory-efficient, resulting in better performance on large data
sets.  (See the ticket for benchmarks.  The most accurate benchmarks are
towards the end.)

A couple of potentially contentious design decisions:

   - There is no distinction between keys and priority values.  A utility
   type Prio p a with the instance Ord p => Ord (Prio p a) is exported to allow
   usage of distinct keys and priority values.
   - Min-queues and max-queues are separated in the following way:
      - Data.PQueue.Min exports the type MinQueue.
      - Data.PQueue.Max exports the type MaxQueue.  (This is implemented as
      a wrapper around MinQueue.)  The method names are the same, but I think
      this is acceptable, because I can't think of any algorithms that use a
      min-queue and a max-queue separately.
      - Data.PQueue adds the alias type PQueue = MinQueue, so that the
      "default" behavior is a min-queue.

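To make the Prio idea concrete, here is a small sketch of a type whose
ordering looks only at the priority.  Only the type name and the shape of the
Ord instance come from the description above; the field names, the Eq
instance (which is assumed here to compare priorities only), and the helper
functions lowest and highest are hypothetical.  The second helper also shows
how an order-reversing wrapper (Data.Ord.Down from base) yields max-first
behavior from a min-first operation, which is one way a MaxQueue could wrap a
MinQueue.  Both helpers are partial and fail on an empty list.

```haskell
import Data.Ord (Down (..), comparing)

-- Field names are hypothetical; only the type and its Ord instance are
-- described in the message.
data Prio p a = Prio { priority :: p, payload :: a }
  deriving Show

-- Assumption: equality, like ordering, ignores the payload, so payloads
-- need neither Eq nor Ord.
instance Eq p => Eq (Prio p a) where
  x == y = priority x == priority y

instance Ord p => Ord (Prio p a) where
  compare = comparing priority

-- Payload with the smallest priority (what a min-queue would return first).
lowest :: Ord p => [Prio p a] -> a
lowest = payload . minimum

-- Wrapping in Down reverses the ordering, so the same "minimum" operation
-- now yields the largest priority.
highest :: Ord p => [Prio p a] -> a
highest xs = case minimum (map Down xs) of
  Down x -> payload x
```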
These design decisions seem to be sufficient to handle most traditional uses
for priority queues.  In particular, this approach offers a superset of the
functionality offered by Java's built-in priority queue implementation
<http://java.sun.com/javase/6/docs/api/java/util/PriorityQueue.html>,
which makes the same design decisions, but of course, is all imperative and
yucky!  (Also, it offers inferior asymptotics, heh.)

I made a particular effort to offer the sort of utility functions that are
found in the other modules of containers.  In particular, it offers:

   - take, takeWhile, span, and that whole family of functions.  take k q
   returns the *list* of the top k elements, and drop k q returns the *queue*
   with the first k elements deleted.  The rest of these methods have analogous
   signatures.
   - q !! k is equivalent to toAscList q !! k.
   - filter and partition are offered in O(n) time.  (It's not obvious that
   my implementation actually runs in O(n) time, but I managed to prove it.)
   - We offer Functor, Foldable, and Traversable instances that do not
   respect key ordering.  All are linear time, but Functor and Traversable in
   particular assume the function is monotonic.  The Foldable instance is a
   good way to access the elements of the priority queue in an unordered
   fashion.  (We also export mapMonotonic and traverseMonotonic, and encourage
   the use of those functions instead of the Functor or Traversable instances.)
   - We offer foldrAsc, foldrDesc, foldlAsc, and foldlDesc.
   (Descending-order operations are just implemented as duals of the
   ascending-order operations, for MinQueue.  For MaxQueue, it's the other way
   around.)
   - Correspondingly, we export toList, toAscList, toDescList, fromList,
   fromAscList, fromDescList.  (toList returns an *unordered* traversal, and is
   *not* equivalent to toAscList.)
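
The intended meaning of these functions can be pinned down with a tiny
reference model.  The sketch below is not the proposed API: it models a queue
as a plain unordered list, and every name in it (ModelQueue, takeQ, dropQ,
indexQ, preservesOrder, and the Q-suffixed conversions) is hypothetical.  It
only illustrates which results are ordered, which are not, and what
"monotonic" means for the mapping functions.

```haskell
import Data.List (sort)

-- Hypothetical reference model: the queue is an unordered bag in a list.
newtype ModelQueue a = ModelQueue [a]

fromListQ :: [a] -> ModelQueue a
fromListQ = ModelQueue

-- Unordered traversal: elements in whatever order the structure holds them.
toListQ :: ModelQueue a -> [a]
toListQ (ModelQueue xs) = xs

toAscListQ :: Ord a => ModelQueue a -> [a]
toAscListQ = sort . toListQ

toDescListQ :: Ord a => ModelQueue a -> [a]
toDescListQ = reverse . toAscListQ

-- take returns a list: the first k elements in ascending order.
takeQ :: Ord a => Int -> ModelQueue a -> [a]
takeQ k = take k . toAscListQ

-- drop returns a queue: everything except the first k elements.
dropQ :: Ord a => Int -> ModelQueue a -> ModelQueue a
dropQ k = ModelQueue . drop k . toAscListQ

-- Indexing agrees with indexing into the ascending list.
indexQ :: Ord a => ModelQueue a -> Int -> a
indexQ q k = toAscListQ q !! k

-- A monotonic map is only safe when f preserves the order of the elements;
-- this checks that property for the elements currently in the queue.
preservesOrder :: (Ord a, Ord b) => (a -> b) -> ModelQueue a -> Bool
preservesOrder f q =
  let ys = map f (toAscListQ q)
  in and (zipWith (<=) ys (drop 1 ys))
```

In this model, doubling every element preserves order while negating does
not, which is the kind of function the Functor and Traversable instances
would silently mishandle.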

I'm really satisfied with the patch as-is, modulo maybe tinkering with the
code style a little.  I'm also working on an article for TMR on priority
queues in Haskell, some of the different structures we considered, and
particularly the new type-safe implementation I came up with for binomial
heaps while writing this one.

In conclusion, I want to be sure people actually like this approach!  So
check it out.  Complaints are appreciated, but even "I think your
implementation is absolutely perfect" would reassure me. =)

Louis Wasserman
wasserman.louis at gmail.com
http://profiles.google.com/wasserman.louis