Proposal: priority queues in containers
Louis Wasserman
wasserman.louis at gmail.com
Thu Mar 18 11:43:43 EDT 2010
Oh, god, so much to respond to...heh.
Submit this package for canonicalization as part of the Haskell Platform. I
> would for one would support its inclusion.
This is an option I seriously hadn't considered. To be fair, that's because
I've never used the Platform myself, preferring rather to have the most
up-to-date version of GHC at all times, heh. That said, while I'd be okay
with this option, I'd prefer putting it into containers, because I feel like
a canonical, reliable priority queue implementation is the sort of thing a
self-respecting language ought to have built in.
As does Python. In Python, though, the PQ implementation is not a built-in
> class in the default namespace (as are dict and set). Rather, it is one of
> the "batteries included" libraries that come with Python. I think that's the
> right place for it in Haskell, too.
I don't know Python, but according to Wikipedia, dict and set are built into
the language. I don't think it's a fair comparison: set and dict in Python
seem to have a role almost as ubiquitous as [] in Haskell, much more
ubiquitous than e.g. Data.Set or Data.Map. I'm also not entirely sure that
"batteries included" doesn't describe containers, given all the other
packages that come with GHC.
> * There is no distinction between keys and priority values. A utility
> type
> Prio p a with the instance Ord p => Ord (Prio p a) is exported to
> allow
> usage of distinct keys and priority values.
I disagree with this one. It requires an Ord instance that isn't really an
> ordering, and makes a Functor instance impossible. I would prefer an
> interface separating keys and values like that of Data.Map (which would also
> increase consistency within the package).
I'd be okay with separating out a priority/value version. However, I'm
still clueless as to what you're talking about with respect to Functor --
what's wrong with the following?
data Prio p a = Prio p a
instance Ord p => Ord (Prio p a) where ...
instance Functor (Prio p) where fmap f (Prio p a) = Prio p (f a)
I can understand if you're uncomfortable with (==) and (\ x y -> compare x y
== EQ) being inequivalent, but neither the H98 Report nor the Prelude make
any such claim, as far as I can tell.
The Foldable instance breaks the abstraction. I think it should
> process elements in order.
I think that wanting to iterate over the priority queue in the middle of the
computation, without caring about order, is a perfectly legitimate desire
for a programmer! Moreover, the only way to make a Foldable instance
process elements in order would be something like
data Ord a => PQueue a = ....
which I think is an awfully dirty approach. I'm not at all a fan of adding
restrictions like that, not least because it adds lots of awkward overhead.
Would you be okay with not exporting a Foldable instance at all, but still
exporting a toList method which doesn't guarantee any ordering on the return
list?
My advice would be not to stress over whether priority queues go into
> containers. It's not some pristine thing of beauty that deserves treatment
> with velvet gloves.
I'm...not sure how to respond to this claim. At least part of me wants to
say, I genuinely do think the containers package is a piece of art...and
then another part pipes up, "except for the inconsistencies between the
various minView/maxView versions, and the little differences between IntMap
and Map, and..." That said, I wouldn't be a fan of scrapping the style
which the containers package has at least tried to create. I could be
convinced that rewriting the rest of containers would be a good thing to do,
though...and I might just want to do that myself. Hah.
How does this implementation compare with using Map/Set as a
> priority queue?
Continuing the discussion of the benchmarks: first, Jim, it appears that I'm
the one who made a n00b benchmarking error. TT_____TT That said, as you
found, my implementation is still slightly faster when the benchmark is
corrected. Some comments:
- QuickBinom needs to have O(1) findMin for a reasonable comparison. I
added that in the benchmark below, and here.
- I can't think of any more optimizations for the sparse binomial heap --
I genuinely think it's not going to get better.
- There is a way to optimize my implementation still further, but it
makes my code much less readable. (Specifically, I start the BinomForest at
Succ Zero, and unpack the first child of every node still in the forest.
Modifying the whole implementation that way, though, makes it unreadably
ugly...and I think QuickBinom is possibly already at that point. I started
implementing it, and realized just how ugly it was, and I stopped, but I can
finish it if I have to.)
Sorting 500,000 elements, compiled with -O2, run with +RTS -H128m -K64m,
with another few implementations thrown in for good measure:
Times (ms)
min mean +/-sd median max
Pairing: 1440.090 1482.893 31.501 1482.093 1532.095
Binomial: 1356.084 1389.687 26.881 1388.087 1436.090
SBinomial: 1376.086 1422.489 48.453 1400.088 1520.095
Data.Set: 1712.107 1800.912 74.880 1782.111 1976.123
Skew: 1584.099 1644.503 85.702 1602.101 1848.116
Some other benchmarks<http://hackage.haskell.org/trac/ghc/attachment/ticket/3909/plot_2.png>were
done by Milan Straka earlier, when we hadn't decided what heap
implementation to use at all. His comments:
> I think the pairing heaps are out of the question now. I would choose
> between Binomial and Leftist. The Binomial have O(1) amortized inserts, but
> beware, that this does not work in persistent setting -- the insert is O(log
> N) when the heaps are used persistently (there are Skew Binomial Heaps with
> O(1) insert in the persistent setting too).
I vote for current (v5) Binomial implementation, even if the O(1)
> amortized inserts works only non-persistently (ie. under heavy persistent
> usage, leftist heaps might be a _little_ better).
Conclusions: There aren't any differences in asymptotics, and as it stands,
the existing implementation is just as fast. It's also a) done, and b) full
of Haskellish type goodness.
After about five hours' work (!!!) I *finally* managed to install Criterion
yesterday, so I'll send out those tests ASAP.
Louis Wasserman
wasserman.louis at gmail.com
http://profiles.google.com/wasserman.louis
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/libraries/attachments/20100318/84893357/attachment.html
More information about the Libraries
mailing list