[core libraries] Rename NonEmpty.groupWith/sortWIth to *On*

Carter Schonwald carter.schonwald at gmail.com
Sat Feb 15 13:00:20 UTC 2020


I think I follow.

Pedantic point : when you say semantics, it might be less confusing and
more precise to performance tradeoffs.

My English language and naive preference is to assume *With functions
grammar better.  Or *by. I think the *on idiom is younger relatively.

Can I challenge you to reflect on what challenge  you are trying to address
with this proposal and if there’s a course of action that you feel improves
everything but breaks no code ?  And what would make the change an
improvement for software using these modules already

On Sat, Feb 15, 2020 at 7:36 AM Philip Hazelden <philip.hazelden at gmail.com>
wrote:

> Hi,
>
> > clc gets / reads libraries anyways, no need to double email : )
>
> Fair enough. I tried to follow the process suggested at
> https://wiki.haskell.org/Core_Libraries_Committee by emailing both lists.
> I was surprised that the mail to libraries@ got rejected. No big deal,
> but should that page be updated?
>
> > are we talking about the specifics of the implementations while stilling
> having the same observable pure results, OR something different?
>
> For sortWith/sortOn, yeah, my proposal #3 would change performance
> characteristics but not the pure semantics.
>
> The existing functions that we have are
>
>     Data.List.sortOn f =
>       map snd . sortBy (comparing fst) . map (\x -> let y = f x in y `seq`
> (y, x))
>     GHC.Exts.sortWith f = sortBy (\x y -> compare (f x) (f y))
>     Data.List.NonEmpty.sortWith = sortBy . comparing
>
> Where the first two have type signature `Ord b :: (a -> b) -> [a] -> [a]`
> and the last is `Ord b :: (a -> b) -> NonEmpty a -> NonEmpty a`.
>
> Data.List.sortOn is implemented in a way that calls its projection
> function just once per list element, but it allocates an intermediate list.
> If the projection function is expensive, that'll be worth it. The style of
> the functions named sortWith avoids the intermediate list, but at the cost
> of calling the projection twice for each comparison; so that'll be better
> if the projection is cheap.
>
> I bring this up because issue 12044 suggested removing `GHC.Exts.sortWith`
> for being semantically the same as `Data.List.sortOn`, and the performance
> question was brought up there. To be clear, I don't think it's a big deal.
>
> > please spell this out more concretely, i'm not able to decode what you
> mean unambiguiously. Just changing the names of some functions in base in
> isolation breaks code for no reason,
>
> My proposals #1 and #2 are just to change the names of functions, yeah.
> But it's not for no reason.
>
> There are two problems with the current names. The smaller one, applying
> to all of these functions, is that I believe the "On" pattern is more
> widely used. And so if a user tries to guess what these functions will be
> called, they'll most likely go for that.
>
> (I don't have any hard data here, admittedly. We see "On" in
> Data.List.sortOn, but also many functions in Data.List.Extra like `nubOn`,
> `groupOn`, `maximumOn`. Admittedly that's not in base. But I'll also note
> that NonEmpty.sortWith was originally called sortOn, and the groupWith
> functions were originally requested with the name groupOn; see
> https://github.com/ekmett/semigroups/pull/52. So yeah, my sense is that
> "On" is what most people will expect.)
>
> The more significant one, applying to the group functions but not to
> sortWith, is that the only other function in base named `groupWith` has a
> comparable type signature but different semantics.
> Data.List.NonEmpty.groupWith is "group according to a projection function"
> while GHC.Exts.groupWith is "sort and group according to a projection
> function". If someone gets these behaviours confused, they're going to sort
> when they don't intend to, or to not sort when they do intend to. Using
> more consistent naming may help avoid that.
>
> It may be that this isn't enough reason to break code. If the committee
> decides that it's not, then fair enough.
>
> Though, it now occurs to me that renaming `GHC.Exts.groupWith` to
> `groupAllOn` or `groupAllWith` or something (in Data.List.Extra it's named
> "groupSortOn"), could also help clear up the larger problem while breaking
> less code. So I'll offer that possibility for consideration too.
>
>
>
> On Thu, Feb 13, 2020 at 10:17 PM Carter Schonwald <
> carter.schonwald at gmail.com> wrote:
>
>> Hello Philip, clc gets / reads libraries anyways, no need to double email
>> : )
>>
>> perhaps i'm missing something, are we talking about the specifics of the
>> implementations while stilling having the same observable pure results, OR
>> something different?
>>
>> When does each have a clear performance win?
>>
>> at least from my native english speaker perspective, sortWith is better
>> grammar.  Are you describing a choice of performance characteristics
>> related to  *On vs *With functions?
>>
>> also, could you spell out which reference implementations of which
>> convention enshrine which meanings you have in mind?
>>
>> I dont see a good reason to change the names that are provided from one
>> to the other. Are you merely making the case that both styles should be
>> exported from certain places, and that a particular choice of evaluation
>> trade offs be enshrined by the respective naming convention?
>>
>>
>> please spell this out more concretely, i'm not able to decode what you
>> mean unambiguiously. Just changing the names of some functions in base in
>> isolation breaks code for no reason, so if you are arguing for that, i'm
>> gonna say nope :)
>>
>> On Wed, Feb 12, 2020 at 12:58 PM Philip Hazelden <
>> philip.hazelden at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I'm referring to these functions: `sortWith` `groupWith` `groupAllWith`
>>> `groupWith1` `groupAllWith1`.
>>>
>>> The `On` suffix is more common for such things, and I anticipate it's
>>> what most users will normally expect them to be named.
>>>
>>> Additionally, the name `groupWith` is potentially misleading. That name
>>> is also used in GHC.Exts[1], for a function with the same type signature
>>> but different semantics. `GHC.Exts.groupWith` sorts its input list, unlike
>>> `NonEmpty.groupWith` or `Data.List.Extra.groupOn`. (In NonEmpty, we have
>>> `groupAllWith` for group-with-sort. So we have three names, two semantics,
>>> and no consistency.)
>>>
>>> According to https://github.com/ekmett/semigroups/pull/52, `With` was
>>> chosen because:
>>>
>>> > The preferred vocabulary for On here is With to match the combinators
>>> in GHC.Exts from the "comprehensive comprehensions" paper. That'd make it
>>> so that if you use these with comprehensive comprehensions turned on and
>>> RebindableSyntax turned on then you'd get these combinators.
>>>
>>> But I don't see anything in the docs which suggests that the
>>> TransformListComp extension[2] uses these functions implicitly, or
>>> interacts with RebindableSyntax[3]. So perhaps I'm missing something, but
>>> my guess is this is no longer a concern.
>>>
>>> The case for `sortWith` is weaker, since it might trip people up but it
>>> won't introduce unexpected semantics. There's also a counter argument in
>>> https://gitlab.haskell.org/ghc/ghc/issues/12044. `Data.List.sortOn`
>>> only computes the mapping function once for each element, while
>>> `GHC.Exts.sortWith` computes it every time it does a comparison but uses
>>> fewer allocations. `Data.NonEmpty.sortWith` acts like the latter. My
>>> suggestion would be to replace it with a `sortOn` implemented like
>>> `Data.List.sortOn`, but I also don't think it would be terrible to have
>>> both, or even to just rename and leave this small inconsistency. If there's
>>> no agreement on this function, I think it would be worth renaming the group
>>> functions anyway.
>>>
>>> And so I propose, in descending order of how much I care:
>>>
>>> 1. Rename the group functions to be named "On" instead of "With".
>>>
>>> 2. Rename `sortWith` to `sortOn` In the case of `sortWith`, also
>>> reimplement it as
>>>
>>> 3. Reimplement `sortOn` as
>>>
>>>     sortOn :: Ord b => (a -> b) -> NonEmpty a -> NonEmpty a
>>>     sortOn = fmap snd . sortBy (comparing fst) . fmap (\x -> let y = f x
>>> in y `seq` (y, x))
>>>
>>> I assume the process here would be to leave the existing names for a
>>> time with deprecation warnings, and then remove them a few versions down
>>> the line.
>>>
>>> [1]
>>> https://hackage.haskell.org/package/base-4.12.0.0/docs/GHC-Exts.html#v:groupWith
>>> [2]
>>> https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#extension-TransformListComp
>>> [3]
>>> https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#extension-RebindableSyntax
>>>
>>> Best,
>>> Phil
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "haskell-core-libraries" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to haskell-core-libraries+unsubscribe at googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/haskell-core-libraries/CALB5dS-5qKjK6rmvFg2TuZ_3HvPtPyuGzsTrVO8adAnX1zovdg%40mail.gmail.com
>>> <https://groups.google.com/d/msgid/haskell-core-libraries/CALB5dS-5qKjK6rmvFg2TuZ_3HvPtPyuGzsTrVO8adAnX1zovdg%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20200215/eeef1d56/attachment.html>


More information about the Libraries mailing list