[Haskell-cafe] ANN: signed-multiset-0.1

wren ng thornton wren at freegeek.org
Thu Apr 26 05:44:19 CEST 2012


On 4/25/12 5:39 AM, Stefan Holdermans wrote:
> The union of two sets is typically defined as the smallest set that is a superset of both the operands;

Or, the smallest set containing all the elements of both/all operands.

The two definitions coincide for sets.

They diverge for bags/multisets: yours uses maximization whereas mine 
uses addition[1]. Notably, both operations are monoids with zero being 
the identity element. Thus it is sensible for zero to denote 
non-membership in both versions. Both are natural extensions of the case 
for sets; which is "more natural" is subject to debate and context. 
(I've always considered "bags" and "multisets" to be synonymous, though 
perhaps others mean them to distinguish these two different ideas of 
generalizing sets.)

They continue to diverge when multiplicities can be negative; here the 
use of zero becomes an issue. For the additive definition, zero is still 
okay for denoting non-membership because zero is the identity for 
addition on integers (as well as the identity for addition on naturals). 
This is a natural extension of the additive case for bags, we're just 
adding closure for subtraction. However, while zero is the identity for 
maximization on naturals, there is no identity for maximization on the 
integers. You'd have to use the integers plus negative infinity in order 
to have a monoid.


[1] Actually, maximization only works out if we assume that the 
arguments are "maximally non-disjoint" ---i.e., share elements whenever 
possible--- whereas addition only works out if we assume the arguments 
are disjoint. These are the two extreme cases, and therefore the most 
interesting, though it's possible to get any multiplicity between those 
extremes. Assuming sets to be disjoint is a common thing, however; 
whereas I don't know of any set-theoretic way of defining "maximally 
non-disjoint" to be anything meaningful.


> I do not recall claiming maximisation to be "natural"; what's natural for one application, may not be for another.

I'm only using natural in the mathematical sense (i.e., as the "obvious" 
extension of something). And you did claim maximization was the 
obvious/appropriate way to extend the case for sets.

I'm fine with the claim that maximization is natural for bags/multisets. 
It's just that addition is natural there too. Sets are where lattice 
theory and ring theory meet; so there's no a priori reason to say that 
generalizations of sets belong more to one theory than the other. But 
the one thing both theories agree on is the importance of monoids.

The problem with the further extension from bags/multisets to signed 
multisets is that the difference between lattices and rings becomes a 
lot more apparent. You can make maximization work in a sensible way, and 
it's a desirable thing to have; you just can't make it work with zero 
because zero has no special significance to the maximization function on 
integers.

> That said, my goal was never to upload a package that only leads to confusion and false expectations. As the common opinion seems to be that a maximising union is bad, I will drop it from the library and deprecate the package on Hackage.

Deprecating the package seems a bit excessive. Why not just extend the 
documentation to explain exactly what properties the two variants have 
and why different people might want one over the other? Why not just 
rename "union" (and "map") to something else, thereby giving neither 
variant the default status? Users can always define union = theUnionIWant.


> [1] Wayne D. Blizard: Multiset Theory. Notre Dame Journal of Formal Logic 30(1): 36-66 (1989)
> [2] Wayne D. Blizard: Negative Membership. Notre Dame Journal of Formal Logic 31(3): 346-368 (1990)
> [3] Apostolos Syropoulos: Mathematics of Multisets. WMP 2000: 347-358

If you take a look at Syropoulos's definition 12 (the subhybridset 
definition of Loeb) you can see that there's something funny going on. 
In particular, the definition of (<<) cannot be correct because 
Syropoulos says:

     i << j = (i <= j) && (i < 0 || 0 <= j)

But that's identical to just using (<=) in the first place! I think what 
Syropoulos/Loeb means to say instead is that:

     i << j = (abs i <= abs j) && (i < 0 || 0 <= j)

In which case you actually get a partial order where zero does have 
significance. Though it's a rather peculiar ordering, so maybe they mean 
something else again.


Also, lest you think the rest of us are crazy, all the other authors 
I've seen use "union" to mean what Syropoulos calls "sum" (defn.7), and 
use some other terminology like "minimal-union" to mean what Syropoulos 
calls "union" (defn.9). E.g.,

 
http://www.cs.utexas.edu/~moore/acl2/workshop-2002/contrib/martin-alonso-hidalgo-ruiz/generic-multiset.pdf

Which came up when trying to locate the papers you mentioned.

-- 
Live well,
~wren



More information about the Haskell-Cafe mailing list