Proposal: Remove Semigroup and Monoid instances for Data.Map, Data.IntMap, Data.HashMap

Oleg Grenrus oleg.grenrus at iki.fi
Wed Feb 14 15:48:36 UTC 2018


I don't like a "no-instances" window. We can change the instances to correct
ones directly. That's what semantic versioning is about, to be able to
communicate semantic changes.

We should go directly to

   Semigroup v => Semigroup (Map k v)
   Semigroup v => Monoid (Map k v)

If `containers-0.6` or `containers-1.0` included some other (compile-time)
breaking change: even better.

I want to point out that a lot of code *will* break at compile-time with the
change.  In polymorphic setting `v` doesn't have `Semigroup v`
attached.  Many
concrete types arent `Semigroup` (e.g. `Int`; here and later when I say
isn't
`Semigroup`, I mean that it isn't `Monoid` either).

I tried to compile our `futurice-prelude` [1] and one small size internal
app (40 modules) with patched containers: things break at compile time

This change *won't get thru unnoticed*. See below for details.

Also people (we and others) use `mempty` for `Map.empty`, but for `mempty`
semantics won't change. We might need to use `Map.empty` if `v` isn't
`Semigroup`,
but that's catched by compiler. Note: that would mean to use `Map.empty`
everywhere
if there is no-instances window.

I also conjecture, based on the results, that `Ord k => Monoid (Map k
v)` for
`union` is currently avoided in libs. Maybe because it does *the wrong
thing*,
so people don't use it. In our own 50kLOC codebase we have three
`Map.unionWith` or `Map.unionsWith`, one of which is

    groupThings = Map.unionsWith (<>) . map f).

I think `Map k String` or `Map k Text` things might be common in private
codebases (not ours). Maybe not the most popular opinion, but I think they
deserve to break.


Something else: If we replace the instance, we have to leave it out of
`containers` for older than `base-4.9`.
- Either we provide `Semigroup` (and `Monoid` !) in `semigroups` (See +) or
- just have `base >= 4.9` in that release of `containers`
  (that will be three GHCs soon).


To conclude, The brekage on Hackage would be driven by things not having
`Semigroup` instances, not semantics. There might be subtle breakages,
but not
something we as community cannot handle. So let's just make a change.

Best regards, Oleg

+ At this point we want `Semigroup v` for `Monoid (Map k v)`, and because
  `semigroups` depend on `containers` (and not other way), we'd need to have
  `Monoid (Map k v)` also in `semigroups`.

---

I did an experiment: compile our `futurice-prelude` [1] with amended
`containers`, I changed only instances for `Map`.


Removal of the instances breaks Cabal, so I could test far enough what
happens:

    Distribution/Simple/Program/GHC.hs:546:12: error:
        • No instance for (Semigroup (Map Extension String))
            arising from a use of ‘gmempty’
        • In the expression: gmempty
          In an equation for ‘mempty’: mempty = gmempty
          In the instance declaration for ‘Monoid GhcOptions’
        |
    546 |   mempty = gmempty
        |            ^^^^^^^

This is one place where we want `union` semantics.
OTOH in current development version of Cabal the map is of type

    Map Extension (Maybe Compiler.Flag),

as we moving avoid from stringly-typed programming. And `Flag` doesn't have
`Semigroup` instance.


Second step: Let's see where Monoid (Map k v) is used. We cannot
deprecate the
instance, but we can se where it's used by a -fdefer-type-error trick
(thanks Bartosz Nitka aka niteria [2])
Note that this step actually has changes from third step already, so
we don't see failures where `Semigroup v` context would have caused
compilation
failure.

To my happy surprise the instance were used only once in Cabal-2.0.1.1
(above),
and then in `futurice-prelude` itself where it would also break with
`Semigroup v`.
So 231 packages at the bottom of dependency graph of lens+servant apps
don't use `Monoid (Map k v)`. Interesting fact, I'd say.


Third step: If I change instance to use `unionWith (<>)`, then the first
failure in our production code is `reducers-3.12.2`

    src/Data/Semigroup/Reducer.hs:233:10: error:
        • Could not deduce (Semigroup v)
            arising from the superclasses of an instance declaration
          from the context: Ord k
            bound by the instance declaration
            at src/Data/Semigroup/Reducer.hs:233:10-42
          Possible fix:
            add (Semigroup v) to the context of the instance declaration
        • In the instance declaration for ‘Reducer (k, v) (Map k v)’
        |
    233 | instance Ord k => Reducer (k, v) (Map k v) where

and `servant-0.13`

    src/Servant/Server/Internal.hs:691:34: error:
        • No instance for (Data.Semigroup.Semigroup
                             (Router' env RoutingApplication))
            arising from a use of ‘mempty’
        • In the first argument of ‘StaticRouter’, namely ‘mempty’
          In the expression: StaticRouter mempty mempty
          In an equation for ‘route’:
              route Proxy _ _ = StaticRouter mempty mempty
        |
    691 |   route Proxy _ _ = StaticRouter mempty mempty

and `swagger2-2.2`.  `SwaggerMonoid` differes for `Maybe`, but has default
implementation in terms of `Monoid`.

    src/Data/Swagger/Internal/Utils.hs:116:10: error:
        • Could not deduce (Data.Semigroup.Semigroup v)
            arising from a use of
‘Data.Swagger.Internal.Utils.$dmswaggerMappend’
          from the context: Ord k
            bound by the instance declaration
            at src/Data/Swagger/Internal/Utils.hs:116:10-41
          Possible fix:
            add (Data.Semigroup.Semigroup v) to the context of
              the instance declaration
        • In the expression:
            Data.Swagger.Internal.Utils.$dmswaggerMappend @Map k v
          In an equation for ‘swaggerMappend’:
              swaggerMappend
                = Data.Swagger.Internal.Utils.$dmswaggerMappend @Map k v
          In the instance declaration for ‘SwaggerMonoid (Map k v)’
        |
    116 | instance Ord k => SwaggerMonoid (Map k v)

And then we get to `futurice-prelude` which also fails:

    src/Futurice/Prelude.hs:177:41: error:
        • Could not deduce (Semigroup v) arising from a use of ‘ifolded’

Funny thing about that code, is that there we work with `Map k (Map l
v)` and
fold them.  For other one we want `UnionWith` and for other any semantic
should
be ok  (we have tests, they *will* break if I get things wrong).

We also have a `Map` wrapper which also breaks loudly.


Fourth step: fix above and contunue investigation where `Monoid (Map k
v)` is
used.  I built one internal application which pulls in even more
dependencies
in addition to what `futurice-prelude` already depends upon (465 deps, e.g.
`amazonka`).

- Two of our internal libraries use `mempty` (second one a lot). Not
breaking.

- One package uses a `Map` wrapper used above (for the sake of experiment I
  passed the `Warning` constraint down)

- The app itself uses `mempty` of `folded` on things which aren't
`Semigroup`.

- `yaml-0.8.28` breaks:

    Data/Yaml/Parser.hs:154:13: error:
        • Could not deduce (Data.Map.Internal.Warning YamlValue)
            arising from a use of ‘await’

  `YamlValue` doesn't have a `Semigroup` (or `Monoid`) instance.

- `amazonka-core-1.5.0`: `mempty`: non silent breakage.

    src/Network/AWS/Data/XML.hs:252:49: error:
        • No instance for (containers-0.5.11.0:Data.Map.Internal.Warning
                             Text)
            arising from a use of ‘mempty’
        • In the second argument of ‘Element’, namely ‘mempty’
          In the first argument of ‘NodeElement’, namely
            ‘(Element (name k) mempty [NodeContent v])’
          In the expression:
            NodeElement (Element (name k) mempty [NodeContent v])
        |
    252 |     node (k, v) = NodeElement (Element (name k) mempty
[NodeContent v])


[1]: https://github.com/futurice/haskell-futurice-prelude
[2]: https://ghc.haskell.org/trac/ghc/ticket/12014#comment:6


On 14.02.2018 01:50, Herbert Valerio Riedel wrote:
> David,
>
>>> Alright, then let's do a little Gedankenexperiment; what if you release
>>> a containers-0.9.0.0 (which drops the instances) and a
>>> containers-1.0.0.0 (which 'adds back' the desired new instances) on the
>>> same day!
>>>
>>> ...wouldn't this allow us to have the cake and eat it too? ;-)
>> It would let us have some cake. Users would be able to test against
>> 0.9, in theory. But they'd have to do it intentionally.
> This is what I was trying to get at: that's always the case. It doesn't
> matter whether you release it same day, a week later, or a year later. I
> have to intentionally *not* skip the 0.9 release. In fact, I'd want to
> skip the 0.9 release myself if possible, as supporting both,
> containers-0.9 and containers-1.0 in the same consumer code is going to
> be awkward enough for me not wanting to do that (or I'd have to
> basically not use the instances at all); If I care about the new
> instances, I'd rather specify `containers ^>= 1.0.0.0` and thus not
> support containers-0.9.
>
> I would have proceeded to point out in my Gedankenexperiment, that in
> order to use such a container-0.9 release, you'd have to force every
> package that depends on `containers` to go through such a 0.9 phase in
> lockstep, which in itself poses an ordeal in terms of package update
> logistic; and then if you spread the time between the 0.9 and the 1.0
> release apart, repeat this dance a 2nd time, with yet another new major
> version bump, which requires yet another round of consumer-code reviews
> (*because* a major version increment was signalled; and we do that as
> API consumers are supposed to pay attention to the major version
> increment signal...)
>
> So consequently, if anything, a container-0.9 release would be a kind of
> an artificial pseudo-release anyway IMO. You could even just condense it
> into a cabal package flag of a containers-1.0 release, which disables
> the Monoid/Semigroup instances; then you don't even need a distinct
> container-0.9 release at all, nor do you have to contaminate the API
> progression with an artificial container-0.9 discontinuity.
>
>> And Stack-based projects would probably need some shenanigans to deal
>> with a version of containers that doesn't come with GHC.
> I'm not convinced we should let the weakest link hold us back. If
> there's limitations in the tooling, we should rather address them, rather
> than contort ourselves; c.f. the Genius Tailor
> ( http://www.wealthculture.cn/infoShow.asp?nid=880&sort=Article%20List )
>
>> So I really think that avoiding these subtle bugs requires at least a
>> full GHC release cycle.
> I don't think waiting a full GHC release would be either necessary nor
> effective at addressing your concerns, which I consider to be disputable
> arguments to begin with.
>
>
> -- hvr
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries



More information about the Libraries mailing list