Fwd: Improving containers library

Louis Wasserman wasserman.louis at gmail.com
Thu Mar 4 09:54:03 EST 2010


Whoops, meant to send this to the whole libraries list.

What do people think of this implementation and approach?  It's got O(log n)
union and all.  I think it could even be made stable, with some thought...

Louis Wasserman
wasserman.louis at gmail.com
http://profiles.google.com/wasserman.louis


---------- Forwarded message ----------
From: Louis Wasserman <wasserman.louis at gmail.com>
Date: Wed, Mar 3, 2010 at 8:06 PM
Subject: Re: Improving containers library
To: Milan Straka <fox at ucw.cz>


Hokay, check it out: http://hackage.haskell.org/trac/ghc/ticket/3909

I just started this ticket to get going on adding priority queues to
containers.

I spent probably way too much time today working on implementing binomial
queues in Haskell *well*.  I think that while my code needs cleaning up, my
data structures are just about perfect.  The way I wrote it, that my program
type-checks provides serious evidence that it's written correctly.

data PQueue a = Empty | PQueue {-# UNPACK #-} !Int a !(BinomHeap a)
-- It's based on a binomial heap augmented with a global root, so PQueue n x
forest is a queue
-- with size n, with global root x, and with its other elements in the
binomial heap
type BinomHeap a = BinomForest a ()
data BinomForest a ch
= Nil | Skip !(BinomForest' a ch) | Cons !(Bin a ch) !(BinomForest' a ch)
-- BinomForest a ch represents a binomial forest with roots having some
minimum rank k, in which the keys have type 'a'
-- and the set of children of a root of rank k has type 'ch'.  BinomForest'
a ch is the type of a binomial forest, starting at rank
-- k+1.
type BinomForest' a ch = BinomForest a (Children a ch)

data Bin a ch = Bin a ch
-- If ch corresponds to the children of a binomial node of rank k, then this
is a binomial node of rank k.
data Children a ch = Children {-# UNPACK #-} !(Bin a ch) ch
-- At rank k, ch will have the form Children a (Children a (... (Children a
())...)) with depth k.  Then values of this type,
-- written out, look like Children tk1 (Children tk2 (....(Children tkk
())...)), where tki is a binomial tree of rank (k-i).
-- This is exactly how the children of a binomial tree of rank k work.

Observe that subtrees in Children are written in descending order of rank,
while subtrees in BinomForest are written in ascending order of rank.  When
we delete-min on a binomial heap, we'll need to merge the children of the
smallest root with the rest of the roots, but merging only works when we
have two ascending-rank sequences.  This might cause some problems, except
for a lovely workaround:  we merge the children of the smallest root into
the rest of the roots as we work backwards.  With laziness, we can get away
without even knowing whether or not our current candidate is the smallest
root.

As a result, we get something that type checks perfectly -- which guarantees
that roots of the correct rank are being assembled properly.  Furthermore,
we maintain all of the amortized bounds, without problems.

I'm not entirely sure how different the performance would be in a more
traditional, less type-safe implementation -- for one thing, we might be
able to skip the "Skip" constructor in BinomForest, so when the heap had 2^n
elements we'd only have to traverse a single root.  However, this approach
allows a lot of constructors to be unpacked nicely, which has nontrivial
advantages, to say nothing of the correctness guarantees.


Louis Wasserman
wasserman.louis at gmail.com
http://profiles.google.com/wasserman.louis


On Wed, Mar 3, 2010 at 1:29 PM, Louis Wasserman
<wasserman.louis at gmail.com>wrote:

> Hehehehe.
>
> Actually, in retrospect, most of the data structures in containers are
> relatively strict -- they typically won't defer work, and support worst-case
> performance as opposed to amortized.  (Sequence is the exception, but
> worst-case O(log n) is much nicer than worst-case O(n)).
>
> My thinking now is along the lines of a relatively strict binomial heap
> implementation, with maybe an extra module for pairing heaps when worst-case
> runtime is less important than overall speed.
>
> Louis Wasserman
> wasserman.louis at gmail.com
> http://profiles.google.com/wasserman.louis
>
>
> On Wed, Mar 3, 2010 at 1:16 PM, Milan Straka <fox at ucw.cz> wrote:
>
>> Hi,
>>
>> if I understand the function "fusing" correctly, you use the multi-pass
>> variant of a pairing heap? Personally I thought more in the lines of
>> a two-pass lazy variant of pairing heap mentioned in Okasaki's book.
>> And there are skew heaps... Well, we'll let a benchmark do the judging :)
>>
>> Thanks,
>> Milan
>>
>> > Yo,
>> >
>> > If you're interested in adding priority queues to containers, shameless
>> > plug: I've got a good implementation of pairing heaps in
>> > http://hackage.haskell.org/package/queuelike.  It's a bit obfuscated
>> right
>> > now, but I'd definitely be interested in producing something that's
>> actually
>> > readable and usable enough to be put into containers...
>> >
>> > Louis Wasserman
>> > wasserman.louis at gmail.com
>> > http://profiles.google.com/wasserman.louis
>> >
>> >
>> > On Wed, Mar 3, 2010 at 11:28 AM, <libraries-request at haskell.org> wrote:
>> >
>> > > Re: Improving containers library
>> > >
>>
>> > _______________________________________________
>> > Libraries mailing list
>> > Libraries at haskell.org
>> > http://www.haskell.org/mailman/listinfo/libraries
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/libraries/attachments/20100304/a5f02a15/attachment.html


More information about the Libraries mailing list