Add NonEmptyMap and NonEmptySet to containers
David Feuer
david.feuer at gmail.com
Thu Apr 25 18:04:33 UTC 2019
Oh, I see what you mean. But then containers would depend on GADTs, which
is rather counter to tradition. And I don't really see the benefit over a
newtype and smart constructors around the existing Seq type.
On Thu, Apr 25, 2019, 1:41 PM Zemyla <zemyla at gmail.com> wrote:
> We could have a GADT for the FingerTree:
>
> data Emptiness = E | NE
>
> data FingerTree e a where
> EmptyT :: FingerTree 'E a
> SingleT :: a -> FingerTree e a
> DeepT :: !(Digit a) -> FingerTree e' (Node a) -> !(Digit a) ->
> FingerTree e a
>
> And because Seq is a newtype wrapper around FingerTree (Elem a) anyway,
> Seq and NonEmptySeq would become newtype wrappers around FingerTree 'E
> (Elem a) and FingerTree 'NE (Elem a). This avoids the majority of the code
> duplication.
>
> On Thu, Apr 25, 2019, 12:34 John Ericson <john.ericson at obsidian.systems>
> wrote:
>
>> Thanks for the clarification, David. Sounds like a good wait-and-see
>> thing: after this lands, Zemyla's proposal and the experience with
>> NonEmpty{Map,Set} can inform what Seq should end up looking like.
>>
>> John
>> On 4/25/19 1:28 PM, David Feuer wrote:
>>
>> No, a Seq can be Empty at the bottom too. It's definitely not a mutually
>> recursive situation like Set or Map, and it's not immediately a top-level
>> one like IntSet or IntMap. A nonempty sequence type would need somewhat
>> different functions for everything. In most cases, the changes seem fairly
>> straightforward. I'm not sure about weird functions like inits and tails.
>> Some extra care might be required for replicate.
>>
>> I believe it would be possible to restructure things to make the
>> distinction a top-level one, by using a possibly-empty type for the top and
>> a nonempty one below. Zemyla has already identified some very solid
>> unrelated reasons to want to separate the tops from the rest, but I'm
>> somewhat concerned about source code duplication with that general
>> approach. Even if we do that, it's not clear to me that we can make a
>> non-empty/possibly-empty distinction without incurring a performance
>> penalty.
>>
>> On Thu, Apr 25, 2019, 1:17 PM John Ericson
>> <john.ericson at obsidian.systems> <john.ericson at obsidian.systems> wrote:
>>
>>> I haven't looked into `Seq` in addition to `Map` and `Set`, just
>>> `IntSet` and `IntMap`. But it might be a similar thing? I take it that
>>> with `Seq` today only the root can be empty and everything else is
>>> single or deep? That means we *would* just use a single Maybe-like thing
>>> at the top level, no mutual recursion. But on the other hand to make
>>> that to make that work efficiently we would would need GHC to support
>>> unboxing sums, so 1 + 2 variants can become a flat 3.
>>>
>>> Also, https://github.com/haskell/containers/pull/616 is now where the
>>> actual implementation is happening, not just the datatype changes as
>>> before. Feel free to comment on the concrete work in progress, everyone!
>>>
>>> John
>>>
>>> On 4/25/19 11:36 AM, Zemyla wrote:
>>> > A Seq has either Empty, Single, or Deep. A NonEmptySeq would have just
>>> > Single or Deep.
>>> >
>>> > On Thu, Apr 25, 2019, 09:55 David Feuer <david.feuer at gmail.com
>>> > <mailto:david.feuer at gmail.com>> wrote:
>>> >
>>> > I don't see the benefit there, unless you see a way to work it
>>> > into the representation.
>>> >
>>> > On Thu, Apr 25, 2019, 10:53 AM Zemyla <zemyla at gmail.com
>>> > <mailto:zemyla at gmail.com>> wrote:
>>> >
>>> > As long as we're doing this, can we also add NonEmptySeq as
>>> well?
>>> >
>>> > On Thu, Apr 25, 2019, 09:11 Artyom Kazak <yom at artyom.me
>>> > <mailto:yom at artyom.me>> wrote:
>>> >
>>> > I'm -1 on any kind of |Map = NEMap|.
>>> >
>>> > An ordinary map and a non-empty map are semantically
>>> > different. I believe that if I non-empty maps were already
>>> > in |containers|, I would pretty much always care whether a
>>> > |Map| I see in code is a 0-map or 1-map.
>>> >
>>> > Similarly, I prefer |Int| and |Word| instead of |Int| and
>>> > |Unsigned.Int|. (Luckily that's already the case.)
>>> >
>>> > We already have a precedent with |Text| and |ByteString|,
>>> > where the lazy and the strict versions are only
>>> > distinguished by the module prefix. In my experience,
>>> > modules where both are used are pretty common, and I end
>>> > up just introducing |type LByteString = Lazy.ByteString|
>>> > in all my projects, because otherwise I need to scroll to
>>> > the imports section whenever I need to know which flavor
>>> > of bytestring is being used. (Or if I'm reading haddocks,
>>> > I have to look at the link because Haddock hides module
>>> > prefixes.)
>>> >
>>> > "why not both" is even worse. I still can't trust the
>>> > |Map|, but now I also have to learn and remember that two
>>> > modules are the same. Speaking from experience again –
>>> > most people seem to be surprised by the fact that
>>> > |Data.Map.Lazy| and |Data.Map.Strict| export the same
>>> > |Map| type. The proposed module hierarchy would move
>>> > |containers| to the top of my "packages that confuse
>>> > beginners" list, beating even |aeson|.
>>> >
>>> > As an aside, I wish we had a proper interface for
>>> > container-like structures, or at least a solution to name
>>> > scoping. I really like the way Rust does it, for instance,
>>> > where certain functions can be "attached" to a type – I'm
>>> > hesitant to call them "methods" because Rust is not an OOP
>>> > language.
>>> > On Apr 25 2019, at 2:49 pm, Mario Blažević
>>> > <mblazevic at stilo.com <mailto:mblazevic at stilo.com>> wrote:
>>> >
>>> > On 2019-04-18 11:00 p.m., David Feuer wrote:
>>> >
>>> > I'm in favor of the proposal. I find the
>>> > isomorphism between Map (a,b) v
>>> > and Map a (NonemptyMap b v) very pleasant. The
>>> > fact that others have
>>> > written less-performant implementations of this
>>> > idea is rather
>>> > convincing. The fact that doing this removes
>>> > partial matches in the
>>> > implementation is nice. And I'll take performance
>>> > improvements where I
>>> > can get them. The main question is the proper name
>>> > of the type. Just
>>> > Data.Map.Nonempty.Map, or .NonemptyMap? Should the
>>> > empty be capitalized?
>>> >
>>> >
>>> > There seems to be a consensus for
>>> > Data.Map.NonEmpty.NEMap, with the
>>> > type and the functions slightly off the regular ones.
>>> > This design would
>>> > make it easier to use regular and non-empty containers
>>> > together, but it
>>> > be annoying for the use case of replacing all uses of
>>> > an existing
>>> > regular container with a non-empty one. I'd rather
>>> > change just the
>>> > import declaration than all occurrences of the type
>>> > name and functions.
>>> >
>>> > I don't want to derail the implementation with
>>> > bikeshedding, so I'm
>>> > just going to ask why not both? The library can both
>>> > export the tweaked
>>> > names and add a module, say Data.NonEmpty.Map.Lazy,
>>> > that exports the
>>> > type synonym Map = NEMap. It would also rename all the
>>> > functions back to
>>> > their names from Data.Map.Lazy.
>>> >
>>> >
>>> >
>>> > On Thu, Apr 18, 2019, 7:15 PM John Cotton Ericson
>>> > <John.Ericson at obsidian.systems>
>>> <John.Ericson at obsidian.systems> wrote:
>>> >
>>> > In
>>> > https://github.com/haskell/containers/issues/608 I
>>> > proposed
>>> > adding non-empty variants of Map and Set,
>>> analogous to
>>> > Data.List.NonEmpty for List, to containers.
>>> > semigroupoids
>>> > demonstrates the many uses and structure of
>>> > non-empty containers in
>>> > general, and libraries such as
>>> > https://github.com/mstksg/nonempty-containers and
>>> > https://github.com/andrewthad/non-empty-containers
>>> > demonstrate the
>>> > interest in non-empty maps and sets in particular.
>>> > My favorite
>>> > use-case is that they're needed to "curry"
>>> > containers: for example,
>>> > |Map (k0, k1) v| is isomorphic not to |Map k0 (Map
>>> > k1 v)| but to
>>> > |Map k0 (NonEmptyMap k1 v)|. I like this use-case
>>> > because it comes
>>> > from the containers themselves.
>>> >
>>> > Importantly, there's no good way to do this
>>> > outside of containers;
>>> > doing so leads to imbalancing / extra indirection,
>>> > or massive code
>>> > duplication. If one wraps the container was an
>>> > extra value like
>>> > Data.List.NonEmpty, one's left with an unavoidable
>>> > extra
>>> > indirection/imbalance. One can rectify this by
>>> > copying and modifying
>>> > the implementation of containers, but that's
>>> > hardly maintainable;
>>> > even as though the algorithms are the same, enough
>>> > lines are touched
>>> > that merging upstream containers is nigh
>>> impossible.
>>> >
>>> > On the other hand, the non-empty containers can be
>>> > elegantly and
>>> > sufficiently implemented alongside their originals
>>> > by taking the Bin
>>> > constructor and breaking it out into it's own
>>> > type, mutually
>>> > recursive with the original. This avoids the
>>> > indirection/imbalancing
>>> > and code duplication problems: the algorithms work
>>> > exactly as before
>>> > creating the same trees (remember the UNPACK), and
>>> > no code
>>> > duplicated since the functions become mutually
>>> > recursive matching
>>> > the types.
>>> >
>>> > To briefly summarize the thread:
>>> >
>>> > 1. I proposed the issue after performing this same
>>> > refactor on the
>>> > dependent-map package:
>>> >
>>> https://github.com/obsidiansystems/dependent-map/tree/non-empty,
>>> > a fork of containers.
>>> > 2. I made
>>> > https://github.com/haskell/containers/pull/616
>>> > which just
>>> > changes the types, to make sure UNPACK preserved
>>> > the importance.
>>> > 3.
>>> >
>>> https://gist.github.com/Ericson2314/58709d0d99e0c0e83ad266701cd71841
>>> > the benchmarks showed rather than degrading
>>> > performance, PR 616
>>> > actually /improved/ it.
>>> >
>>> > If there is preliminary consensus, I'll make a
>>> > second PR on top
>>> > which generalizes the functions like on my
>>> > dependent-map branch.
>>> >
>>> > Thanks,
>>> >
>>> > John
>>> >
>>> > _______________________________________________
>>> > Libraries mailing list
>>> > Libraries at haskell.org
>>> > <mailto:Libraries at haskell.org>
>>> > <mailto:Libraries at haskell.org
>>> > <mailto:Libraries at haskell.org>>
>>> >
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>> >
>>> >
>>> > _______________________________________________
>>> > Libraries mailing list
>>> > Libraries at haskell.org <mailto:
>>> Libraries at haskell.org>
>>> >
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>> >
>>> >
>>> > _______________________________________________
>>> > Libraries mailing list
>>> > Libraries at haskell.org <mailto:Libraries at haskell.org>
>>> >
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>> >
>>> > _______________________________________________
>>> > Libraries mailing list
>>> > Libraries at haskell.org <mailto:Libraries at haskell.org>
>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>> >
>>> > _______________________________________________
>>> > Libraries mailing list
>>> > Libraries at haskell.org <mailto:Libraries at haskell.org>
>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>> >
>>> >
>>> > _______________________________________________
>>> > Libraries mailing list
>>> > Libraries at haskell.org
>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20190425/29d25a2e/attachment.html>
More information about the Libraries
mailing list