Add NonEmptyMap and NonEmptySet to containers

John Ericson john.ericson at obsidian.systems
Thu Apr 25 17:34:23 UTC 2019


Thanks for the clarification, David. Sounds like a good wait-and-see 
thing: after this lands, Zemyla's proposal and the experience with 
NonEmpty{Map,Set} can inform what Seq should end up looking like.

John

On 4/25/19 1:28 PM, David Feuer wrote:
> No, a Seq can be Empty at the bottom too. It's definitely not a 
> mutually recursive situation like Set or Map, and it's not immediately 
> a top-level one like IntSet or IntMap. A nonempty sequence type would 
> need somewhat different functions for everything. In most cases, the 
> changes seem fairly straightforward. I'm not sure about weird 
> functions like inits and tails. Some extra care might be required for 
> replicate.
>
> I believe it would be possible to restructure things to make the 
> distinction a top-level one, by using a possibly-empty type for the 
> top and a nonempty one below. Zemyla has already identified some very 
> solid unrelated reasons to want to separate the tops from the rest, 
> but I'm somewhat concerned about source code duplication with that 
> general approach. Even if we do that, it's not clear to me that we can 
> make a non-empty/possibly-empty distinction without incurring a 
> performance penalty.
>
> On Thu, Apr 25, 2019, 1:17 PM John Ericson 
> <john.ericson at obsidian.systems> wrote:
>
>     I haven't looked into `Seq` in addition to `Map` and `Set`, just
>     `IntSet` and `IntMap`. But it might be a similar thing? I take it
>     that
>     with `Seq` today only the root can be empty and everything else is
>     single or deep? That means we *would* just use a single Maybe-like
>     thing
>     at the top level, no mutual recursion. But on the other hand to make
>     that to make that work efficiently we would would need GHC to support
>     unboxing sums, so 1 + 2 variants can become a flat 3.
>
>     Also, https://github.com/haskell/containers/pull/616 is now where the
>     actual implementation is happening, not just the datatype changes as
>     before. Feel free to comment on the concrete work in progress,
>     everyone!
>
>     John
>
>     On 4/25/19 11:36 AM, Zemyla wrote:
>     > A Seq has either Empty, Single, or Deep. A NonEmptySeq would
>     have just
>     > Single or Deep.
>     >
>     > On Thu, Apr 25, 2019, 09:55 David Feuer <david.feuer at gmail.com
>     <mailto:david.feuer at gmail.com>
>     > <mailto:david.feuer at gmail.com <mailto:david.feuer at gmail.com>>>
>     wrote:
>     >
>     >     I don't see the benefit there, unless you see a way to work it
>     >     into the representation.
>     >
>     >     On Thu, Apr 25, 2019, 10:53 AM Zemyla <zemyla at gmail.com
>     <mailto:zemyla at gmail.com>
>     >     <mailto:zemyla at gmail.com <mailto:zemyla at gmail.com>>> wrote:
>     >
>     >         As long as we're doing this, can we also add NonEmptySeq
>     as well?
>     >
>     >         On Thu, Apr 25, 2019, 09:11 Artyom Kazak <yom at artyom.me
>     <mailto:yom at artyom.me>
>     >         <mailto:yom at artyom.me <mailto:yom at artyom.me>>> wrote:
>     >
>     >             I'm -1 on any kind of |Map = NEMap|.
>     >
>     >             An ordinary map and a non-empty map are semantically
>     >             different. I believe that if I non-empty maps were
>     already
>     >             in |containers|, I would pretty much always care
>     whether a
>     >             |Map| I see in code is a 0-map or 1-map.
>     >
>     >             Similarly, I prefer |Int| and |Word| instead of
>     |Int| and
>     >             |Unsigned.Int|. (Luckily that's already the case.)
>     >
>     >             We already have a precedent with |Text| and
>     |ByteString|,
>     >             where the lazy and the strict versions are only
>     >             distinguished by the module prefix. In my experience,
>     >             modules where both are used are pretty common, and I end
>     >             up just introducing |type LByteString = Lazy.ByteString|
>     >             in all my projects, because otherwise I need to
>     scroll to
>     >             the imports section whenever I need to know which flavor
>     >             of bytestring is being used. (Or if I'm reading
>     haddocks,
>     >             I have to look at the link because Haddock hides module
>     >             prefixes.)
>     >
>     >             "why not both" is even worse. I still can't trust the
>     >             |Map|, but now I also have to learn and remember
>     that two
>     >             modules are the same. Speaking from experience again –
>     >             most people seem to be surprised by the fact that
>     >             |Data.Map.Lazy| and |Data.Map.Strict| export the same
>     >             |Map| type. The proposed module hierarchy would move
>     >             |containers| to the top of my "packages that confuse
>     >             beginners" list, beating even |aeson|.
>     >
>     >             As an aside, I wish we had a proper interface for
>     >             container-like structures, or at least a solution to
>     name
>     >             scoping. I really like the way Rust does it, for
>     instance,
>     >             where certain functions can be "attached" to a type
>     – I'm
>     >             hesitant to call them "methods" because Rust is not
>     an OOP
>     >             language.
>     >             On Apr 25 2019, at 2:49 pm, Mario Blažević
>     >             <mblazevic at stilo.com <mailto:mblazevic at stilo.com>
>     <mailto:mblazevic at stilo.com <mailto:mblazevic at stilo.com>>> wrote:
>     >
>     >                 On 2019-04-18 11:00 p.m., David Feuer wrote:
>     >
>     >                     I'm in favor of the proposal. I find the
>     >                     isomorphism between Map (a,b) v
>     >                     and Map a (NonemptyMap b v) very pleasant. The
>     >                     fact that others have
>     >                     written less-performant implementations of this
>     >                     idea is rather
>     >                     convincing. The fact that doing this removes
>     >                     partial matches in the
>     >                     implementation is nice. And I'll take
>     performance
>     >                     improvements where I
>     >                     can get them. The main question is the
>     proper name
>     >                     of the type. Just
>     >                     Data.Map.Nonempty.Map, or .NonemptyMap?
>     Should the
>     >                     empty be capitalized?
>     >
>     >
>     >                 There seems to be a consensus for
>     >                 Data.Map.NonEmpty.NEMap, with the
>     >                 type and the functions slightly off the regular
>     ones.
>     >                 This design would
>     >                 make it easier to use regular and non-empty
>     containers
>     >                 together, but it
>     >                 be annoying for the use case of replacing all
>     uses of
>     >                 an existing
>     >                 regular container with a non-empty one. I'd rather
>     >                 change just the
>     >                 import declaration than all occurrences of the type
>     >                 name and functions.
>     >
>     >                 I don't want to derail the implementation with
>     >                 bikeshedding, so I'm
>     >                 just going to ask why not both? The library can both
>     >                 export the tweaked
>     >                 names and add a module, say Data.NonEmpty.Map.Lazy,
>     >                 that exports the
>     >                 type synonym Map = NEMap. It would also rename
>     all the
>     >                 functions back to
>     >                 their names from Data.Map.Lazy.
>     >
>     >
>     >
>     >                     On Thu, Apr 18, 2019, 7:15 PM John Cotton
>     Ericson
>     >                     <John.Ericson at obsidian.systems> wrote:
>     >
>     >                     In
>     > https://github.com/haskell/containers/issues/608 I
>     >                     proposed
>     >                     adding non-empty variants of Map and Set,
>     analogous to
>     >                     Data.List.NonEmpty for List, to containers.
>     >                     semigroupoids
>     >                     demonstrates the many uses and structure of
>     >                     non-empty containers in
>     >                     general, and libraries such as
>     > https://github.com/mstksg/nonempty-containers and
>     > https://github.com/andrewthad/non-empty-containers
>     >                     demonstrate the
>     >                     interest in non-empty maps and sets in
>     particular.
>     >                     My favorite
>     >                     use-case is that they're needed to "curry"
>     >                     containers: for example,
>     >                     |Map (k0, k1) v| is isomorphic not to |Map
>     k0 (Map
>     >                     k1 v)| but to
>     >                     |Map k0 (NonEmptyMap k1 v)|. I like this
>     use-case
>     >                     because it comes
>     >                     from the containers themselves.
>     >
>     >                     Importantly, there's no good way to do this
>     >                     outside of containers;
>     >                     doing so leads to imbalancing / extra
>     indirection,
>     >                     or massive code
>     >                     duplication. If one wraps the container was an
>     >                     extra value like
>     >                     Data.List.NonEmpty, one's left with an
>     unavoidable
>     >                     extra
>     >                     indirection/imbalance. One can rectify this by
>     >                     copying and modifying
>     >                     the implementation of containers, but that's
>     >                     hardly maintainable;
>     >                     even as though the algorithms are the same,
>     enough
>     >                     lines are touched
>     >                     that merging upstream containers is nigh
>     impossible.
>     >
>     >                     On the other hand, the non-empty containers
>     can be
>     >                     elegantly and
>     >                     sufficiently implemented alongside their
>     originals
>     >                     by taking the Bin
>     >                     constructor and breaking it out into it's own
>     >                     type, mutually
>     >                     recursive with the original. This avoids the
>     >                     indirection/imbalancing
>     >                     and code duplication problems: the
>     algorithms work
>     >                     exactly as before
>     >                     creating the same trees (remember the
>     UNPACK), and
>     >                     no code
>     >                     duplicated since the functions become mutually
>     >                     recursive matching
>     >                     the types.
>     >
>     >                     To briefly summarize the thread:
>     >
>     >                     1. I proposed the issue after performing
>     this same
>     >                     refactor on the
>     >                     dependent-map package:
>     > https://github.com/obsidiansystems/dependent-map/tree/non-empty,
>     >                     a fork of containers.
>     >                     2. I made
>     > https://github.com/haskell/containers/pull/616
>     >                     which just
>     >                     changes the types, to make sure UNPACK preserved
>     >                     the importance.
>     >                     3.
>     > https://gist.github.com/Ericson2314/58709d0d99e0c0e83ad266701cd71841
>     >                     the benchmarks showed rather than degrading
>     >                     performance, PR 616
>     >                     actually /improved/ it.
>     >
>     >                      If there is preliminary consensus, I'll make a
>     >                     second PR on top
>     >                     which generalizes the functions like on my
>     >                     dependent-map branch.
>     >
>     >                     Thanks,
>     >
>     >                     John
>     >
>     >  _______________________________________________
>     >                     Libraries mailing list
>     > Libraries at haskell.org <mailto:Libraries at haskell.org>
>     >                     <mailto:Libraries at haskell.org
>     <mailto:Libraries at haskell.org>>
>     >                     <mailto:Libraries at haskell.org
>     <mailto:Libraries at haskell.org>
>     >                     <mailto:Libraries at haskell.org
>     <mailto:Libraries at haskell.org>>>
>     > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>     >
>     >
>     >  _______________________________________________
>     >                     Libraries mailing list
>     > Libraries at haskell.org <mailto:Libraries at haskell.org>
>     <mailto:Libraries at haskell.org <mailto:Libraries at haskell.org>>
>     > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>     >
>     >
>     >  _______________________________________________
>     >                 Libraries mailing list
>     > Libraries at haskell.org <mailto:Libraries at haskell.org>
>     <mailto:Libraries at haskell.org <mailto:Libraries at haskell.org>>
>     > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>     >
>     >  _______________________________________________
>     >             Libraries mailing list
>     > Libraries at haskell.org <mailto:Libraries at haskell.org>
>     <mailto:Libraries at haskell.org <mailto:Libraries at haskell.org>>
>     > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>     >
>     >         _______________________________________________
>     >         Libraries mailing list
>     > Libraries at haskell.org <mailto:Libraries at haskell.org>
>     <mailto:Libraries at haskell.org <mailto:Libraries at haskell.org>>
>     > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>     >
>     >
>     > _______________________________________________
>     > Libraries mailing list
>     > Libraries at haskell.org <mailto:Libraries at haskell.org>
>     > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20190425/4344215c/attachment.html>


More information about the Libraries mailing list