Proposal: Remove Semigroup and Monoid instances for Data.Map, Data.IntMap, Data.HashMap

David Feuer david.feuer at gmail.com
Tue Feb 13 20:34:00 UTC 2018


I don't want to maintain a whole package of newtype-wrapped everything
for this, but I gather that Ben Gamari already has one people can use
if they like. Now that Semigroup is in base, I think Kris's idea of
using newtype wrappers *permanently* might be a good one. That way we
can have

newtype UnionCombine k v = UnionCombine (Map k v)
instance (Ord k, Semigroup v) => Semigroup (UnionCombine k v) where
  UnionCombine xs <> UnionCombine ys = UnionCombine $ unionWith (<>) xs ys

newtype IntersectionCombine k v = IntersectionCombine (Map k v)
instance (Ord k, Semigroup v) => Semigroup (IntersectionCombine k v) where
  IntersectionCombine xs <> IntersectionCombine ys =
    IntersectionCombine $ intersectionWith (<>) xs ys

We can add such newtypes (by whatever names people prefer)
immediately, and kick the question of whether to ever reinstate the
instances down the road.
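
To make the difference concrete, here is a minimal, compilable sketch of
such wrappers next to the existing left-biased instance. It assumes GHC
8.4 or later (Semigroup in the Prelude) and the containers package. The
field names getUnionCombine and getIntersectionCombine and the example
values lhs, rhs, biased, combined and shared are made up for
illustration and are not part of any released API.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Hypothetical wrapper: (<>) unions the maps and combines the values
-- stored under keys that appear on both sides.
newtype UnionCombine k v = UnionCombine { getUnionCombine :: Map k v }
  deriving (Eq, Show)

instance (Ord k, Semigroup v) => Semigroup (UnionCombine k v) where
  UnionCombine xs <> UnionCombine ys = UnionCombine (Map.unionWith (<>) xs ys)

-- The empty map is an identity for unionWith, so Monoid comes for free.
instance (Ord k, Semigroup v) => Monoid (UnionCombine k v) where
  mempty = UnionCombine Map.empty

-- Hypothetical wrapper: (<>) keeps only the shared keys and combines
-- their values. No Monoid instance is given here, because in general
-- no finite map acts as an identity for intersection.
newtype IntersectionCombine k v =
  IntersectionCombine { getIntersectionCombine :: Map k v }
  deriving (Eq, Show)

instance (Ord k, Semigroup v) => Semigroup (IntersectionCombine k v) where
  IntersectionCombine xs <> IntersectionCombine ys =
    IntersectionCombine (Map.intersectionWith (<>) xs ys)

-- Two example maps that overlap on the key "b".
lhs, rhs :: Map String [Int]
lhs = Map.fromList [("a", [1]), ("b", [2])]
rhs = Map.fromList [("b", [3]), ("c", [4])]

-- Existing instance: left-biased union, so the [3] under "b" is dropped.
biased :: Map String [Int]
biased = lhs <> rhs

-- Wrapped union: the values under "b" are combined into [2,3].
combined :: Map String [Int]
combined = getUnionCombine (UnionCombine lhs <> UnionCombine rhs)

-- Wrapped intersection: only "b" survives, with value [2,3].
shared :: Map String [Int]
shared = getIntersectionCombine
  (IntersectionCombine lhs <> IntersectionCombine rhs)
```

With the Monoid instance in place, mconcat over a list of UnionCombine
values merges any number of maps the same way. Callers do pay for the
wrapping and unwrapping at each use site, which is the usability cost
raised in the original proposal.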

On Tue, Feb 13, 2018 at 3:16 PM, Kris Nuttycombe
<kris.nuttycombe at gmail.com> wrote:
> Given that we're talking about a years-long process, what about using the
> newtype approach as an intermediate step, so that we don't get additional
> proliferation of third-party newtype wrappers in the interim? Deprecate the
> current instances and add instances using newtype wrappers at the same time,
> such that a newtype'd version of the current functionality is available to
> folks as a migration path, as well as a newtype for which the instances
> provide the correct behavior. Then, in the future, the newtypes for the
> corrected behavior will eventually become superfluous, and for those who
> rely on the newtypes that represent the current behavior, they won't have to
> change anything.
>
> On Tue, Feb 13, 2018 at 1:07 PM, Daniel Cartwright <chessai1996 at gmail.com>
> wrote:
>>
>> I am strongly in favour of this as well. I do not like the newtype
>> approach; I think backwards-*incompatibility* with four to five year-old (or
>> two to three year-old) code is OK.
>>
>>> On Tue, Feb 13, 2018 at 2:59 PM, David Feuer <david.feuer at gmail.com>
>>> wrote:
>>>>
>>>> I don't think the old proposal was really the right way.
>>>>
>>>> On Tue, Feb 13, 2018 at 2:54 PM, Mario Blažević <mblazevic at stilo.com>
>>>> wrote:
>>>> > +1. But whatever happened to your proposal from last May? I don't think
>>>> > there were any objections to it. Would the two proposals be combined, or
>>>> > have you decided to drop the previous one?
>>>> >
>>>> > https://mail.haskell.org/pipermail/libraries/2017-May/028036.html
>>>> >
>>>> > On 2017-05-25 12:55 PM, David Feuer wrote:
>>>> >> A lot of people have wrappers around Data.Map and Data.IntMap to give
>>>> >> them more useful (Semigroup and) Monoid instances. I'd like to add such
>>>> >> wrappers to containers. What we need to be able to do that are *names*
>>>> >> for the new modules. I can't think of any, so I'm reaching out to the
>>>> >> list. Please suggest names! Another question is whether we should take
>>>> >> the opportunity of new modules to modernize and streamline the API a
>>>> >> bit. I'd like, at least, to separate "safe" from "unsafe" functions,
>>>> >> putting the unsafe ones in .Unsafe modules.
>>>> >
>>>> >
>>>> >
>>>> > On 2018-02-13 02:33 PM, David Feuer wrote:
>>>> >>
>>>> >> Many people have recognized for years that the Semigroup and Monoid
>>>> >> instances for Data.Map, Data.IntMap, and Data.HashMap are not so
>>>> >> great. In particular, when the same key is present in both maps, they
>>>> >> simply use the value from the first argument, ignoring the other one.
>>>> >> This somewhat counter-intuitive behavior can lead to bugs. See, for
>>>> >> example, the discussion in Tim Humphries's blog post[*]. I would like
>>>> >> to do the following:
>>>> >>
>>>> >> 1. Deprecate the Semigroup and Monoid instances for Data.Map.Map,
>>>> >> Data.IntMap.IntMap, and Data.HashMap.HashMap in the next releases of
>>>> >> containers and unordered-containers.
>>>> >>
>>>> >> 2. Remove the deprecated instances.
>>>> >>
>>>> >> 3. After another several years (four or five, perhaps?), make a major
>>>> >> release of each package in which the instances are replaced with the
>>>> >> following:
>>>> >>
>>>> >>    instance (Ord k, Semigroup v) => Semigroup (Map k v) where
>>>> >>      (<>) = Data.Map.Strict.unionWith (<>)
>>>> >>    instance (Ord k, Semigroup v) => Monoid (Map k v) where
>>>> >>      mempty = Data.Map.Strict.empty
>>>> >>
>>>> >>    instance Semigroup v => Semigroup (IntMap v) where
>>>> >>      (<>) = Data.IntMap.Strict.unionWith (<>)
>>>> >>    instance Semigroup v => Monoid (IntMap v) where
>>>> >>      mempty = Data.IntMap.Strict.empty
>>>> >>
>>>> >>    instance (Eq k, Hashable k, Semigroup v) => Semigroup (HashMap k v) where
>>>> >>      (<>) = Data.HashMap.Strict.unionWith (<>)
>>>> >>    instance (Eq k, Hashable k, Semigroup v) => Monoid (HashMap k v) where
>>>> >>      mempty = Data.HashMap.Strict.empty
>>>> >>
>>>> >> Why do I want the strict versions? That choice may seem a bit
>>>> >> surprising, since the data structures are lazy. But the lazy versions
>>>> >> really like to leak memory, making them unsuitable for most practical
>>>> >> purposes.
>>>> >>
>>>> >> The big risk:
>>>> >>
>>>> >> Someone using old code or old tutorial documentation could get subtly
>>>> >> wrong behavior without noticing. That is why I have specified an
>>>> >> extended period between removing the current instances and adding the
>>>> >> desired ones.
>>>> >>
>>>> >> Alternatives:
>>>> >>
>>>> >> 1. Remove the instances but don't add the new ones. I fear this may
>>>> >> lead others to write their own orphan instances, which may not even
>>>> >> all do the same thing.
>>>> >>
>>>> >> 2. Write separate modules with newtype-wrapped versions of the data
>>>> >> structures implementing the desired instances. Unfortunately, this
>>>> >> would be annoying to maintain, and also annoying to use--packages
>>>> >> using the unwrapped and wrapped versions will use different types.
>>>> >> Manually wrapping and unwrapping to make the different types work with
>>>> >> each other will introduce lots of potential for mistakes and confusion.
>>>> >>
>>>> >> Discussion period: three weeks.
>>>> >>
>>>> >> [*] http://teh.id.au/posts/2017/03/03/map-merge/index.html
>>>> >> _______________________________________________
>>>> >> Libraries mailing list
>>>> >> Libraries at haskell.org
>>>> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

