A more useful Monoid instance for Data.Map

Henning Thielemann lemming at henning-thielemann.de
Tue Mar 12 23:29:30 CET 2013


On Tue, 12 Mar 2013, Milan Straka wrote:

> Hi all,
>
>> -----Original message-----
>> From: Ben Gamari <bgamari.foss at gmail.com>
>> Sent: 12 Mar 2013, 11:06
>>
>> <snip>
>>
>> We should take advantage of this opportunity to begin an easy cleanup
>> and accept that there are still other issues that will need to be dealt
>> with once the tools for doing so exist.
>
> IIRC, the instances we are talking about are
>
>
> Current:
>  instance (Ord k) => Monoid (Map k v) where
>    mempty  = empty
>    mappend = union
>    mconcat = unions
>
>
> Proposed:
>  instance (Ord k, Monoid v) => Monoid (Map k v) where
>    mempty            = empty
>    mappend map0 map1 = unionWith mappend map0 map1
>    mconcat maps      = foldr (unionWith mappend) empty maps
> or even
>  instance (Ord k, Semigroup v) => Monoid (Map k v) where
>    mempty            = empty
>    mappend map0 map1 = unionWith <> map0 map1
>    mconcat maps      = foldr (unionWith <>) empty maps
>
>
> I am not convinced that the proposed instance is universally the best
> one. I like the current instance -- I sometimes use it on
> Map k v where v is a "primitive" type like Int or Double for which no
> Monoid instance exists. Personally, I am also not sure that
>  fromList [(1, "A")] `mappend` fromList [(1, "B")
> should really be
>  fromList [(1, "AB")]
>
> The price for any change is very high:
> a) silent semantic change for replacing the instance is a no go,
>   consider for example for Map k String or Map k ByteString.
> b) removing and then adding the instance means we will have two major
>   version bumps. That places a burden on every user of the library to
>   at least upgrade upper bounds. And as Edward Kmett wrote earlier
>   today, the users that want to use the Monoid instance must revert to
>   some kind of conditional compilation.
>
> I am not saying we should never do any breaking changes, but in this
> case, the benefit seems rather small to me.


First of all, I do not know, why people require the current semantics of 
"mappend" for Data.Map and why they do not just use "Map.union" in order 
to explicitly tell that they drop duplicate elements intentionally.

The best semantics for "mappend" should be driven by the question what is 
best for working with monoids (e.g. as writer type in the Writer monad). 
The original proposal gave the good argument that by choosing a monoid 
type for the Map element you can choose how elements for duplicate keys 
are merged.

My experience with Map.union and Map.fromList is that I had bugs that were 
very hard to debug because I did not kept the case of duplicate keys in 
mind. Now I have added warnings to HLint for every use of Map.union and 
Map.fromList. Either duplicate keys are allowed, then I should explicitely 
state how to combine them, or keys must be distinct, then I should use 
Map.unionWith (error "duplicate keys"). Unfortunately, my HLint checks are 
bypassed if the 'Map.union' is disguised as a 'mappend'.

I think that changing 'mappend' in the proposed way has the advantage to 
make programmers think about duplicate keys. The answer could be to 
replace 'mappend' by 'Map.union' or choose an element Monoid for combining 
the elements. Additionally Semigroup/Monoid types like First and Last are 
more explicit than the current behaviour of "mappend" where you cannot 
guess whether the left or the right element is kept when keys clash.



More information about the Libraries mailing list