wren ng thornton wren at freegeek.org
Thu Apr 26 05:44:19 CEST 2012

```On 4/25/12 5:39 AM, Stefan Holdermans wrote:
> The union of two sets is typically defined as the smallest set that is a superset of both the operands;

Or, the smallest set containing all the elements of both/all operands.

The two definitions coincide for sets.

They diverge for bags/multisets: yours uses maximization whereas mine
uses addition[1]. Notably, both operations are monoids with zero being
the identity element. Thus it is sensible for zero to denote
non-membership in both versions. Both are natural extensions of the case
for sets; which is "more natural" is subject to debate and context.
(I've always considered "bags" and "multisets" to be synonymous, though
perhaps others mean them to distinguish these two different ideas of
generalizing sets.)

They continue to diverge when multiplicities can be negative; here the
use of zero becomes an issue. For the additive definition, zero is still
okay for denoting non-membership because zero is the identity for
addition on integers (as well as the identity for addition on naturals).
This is a natural extension of the additive case for bags, we're just
adding closure for subtraction. However, while zero is the identity for
maximization on naturals, there is no identity for maximization on the
integers. You'd have to use the integers plus negative infinity in order
to have a monoid.

[1] Actually, maximization only works out if we assume that the
arguments are "maximally non-disjoint" ---i.e., share elements whenever
possible--- whereas addition only works out if we assume the arguments
are disjoint. These are the two extreme cases, and therefore the most
interesting, though it's possible to get any multiplicity between those
extremes. Assuming sets to be disjoint is a common thing, however;
whereas I don't know of any set-theoretic way of defining "maximally
non-disjoint" to be anything meaningful.

> I do not recall claiming maximisation to be "natural"; what's natural for one application, may not be for another.

I'm only using natural in the mathematical sense (i.e., as the "obvious"
extension of something). And you did claim maximization was the
obvious/appropriate way to extend the case for sets.

I'm fine with the claim that maximization is natural for bags/multisets.
It's just that addition is natural there too. Sets are where lattice
theory and ring theory meet; so there's no a priori reason to say that
generalizations of sets belong more to one theory than the other. But
the one thing both theories agree on is the importance of monoids.

The problem with the further extension from bags/multisets to signed
multisets is that the difference between lattices and rings becomes a
lot more apparent. You can make maximization work in a sensible way, and
it's a desirable thing to have; you just can't make it work with zero
because zero has no special significance to the maximization function on
integers.

> That said, my goal was never to upload a package that only leads to confusion and false expectations. As the common opinion seems to be that a maximising union is bad, I will drop it from the library and deprecate the package on Hackage.

Deprecating the package seems a bit excessive. Why not just extend the
documentation to explain exactly what properties the two variants have
and why different people might want one over the other? Why not just
rename "union" (and "map") to something else, thereby giving neither
variant the default status? Users can always define union = theUnionIWant.

> [1] Wayne D. Blizard: Multiset Theory. Notre Dame Journal of Formal Logic 30(1): 36-66 (1989)
> [2] Wayne D. Blizard: Negative Membership. Notre Dame Journal of Formal Logic 31(3): 346-368 (1990)
> [3] Apostolos Syropoulos: Mathematics of Multisets. WMP 2000: 347-358

If you take a look at Syropoulos's definition 12 (the subhybridset
definition of Loeb) you can see that there's something funny going on.
In particular, the definition of (<<) cannot be correct because
Syropoulos says:

i << j = (i <= j) && (i < 0 || 0 <= j)

But that's identical to just using (<=) in the first place! I think what
Syropoulos/Loeb means to say instead is that:

i << j = (abs i <= abs j) && (i < 0 || 0 <= j)

In which case you actually get a partial order where zero does have
significance. Though it's a rather peculiar ordering, so maybe they mean
something else again.

Also, lest you think the rest of us are crazy, all the other authors
I've seen use "union" to mean what Syropoulos calls "sum" (defn.7), and
use some other terminology like "minimal-union" to mean what Syropoulos
calls "union" (defn.9). E.g.,

http://www.cs.utexas.edu/~moore/acl2/workshop-2002/contrib/martin-alonso-hidalgo-ruiz/generic-multiset.pdf

Which came up when trying to locate the papers you mentioned.

--
Live well,
~wren

```