[Haskell-cafe] inversion lists (was: O(n) algorithm for determining subset)

Mon Nov 16 11:27:40 EST 2009

On Mon, 16 Nov 2009 00:03:54 +0300 Eugene Kirpichov <ekirpichov at gmail.com> wrote: 

EK> 2009/11/15 Michael Mossey <mpm at alumni.caltech.edu>:
>> Can someone tell me if this is correct. I'm guessing that if I represent
>> two sets of integers by Word32 (where ith bit set means i is in the set),
>> then an algorithm to determine if x is a subset of y would go something
>> like
>> 
>> 
>>  (y - x) == (y `exclusiveOr` x)

EK> It's simpler: "x&~y == 0"

Depending on the OP data set the simple bit vector may be a good
strategy, but I wonder if an inversion list would be better?  Do you
expect long runs of adjacent numbers and gaps?  Inversion lists tend to
encode those much more efficiently than bit vectors.

I'm just now learning Haskell so I don't know enough to show an
implementation but inversion lists are very simple conceptually.  If
your set is (2,3,4,5,8,9,10) the inversion list would be (2,6,8,11).
It's easy to generate it from a set of Ord elements.

I'd be curious to see an implementation of the above and other set
operations with inversion lists, in fact, so I can learn from it.  I
couldn't find any implementation online for Haskell.

Thanks
Ted