Add NonEmptyMap and NonEmptySet to containers

John Ericson john.ericson at obsidian.systems
Thu Apr 25 18:10:16 UTC 2019


I think the concrete benefit to GADTs is that the tags are not 
monomorphized wrt the equality constraints. So I'd view this or the 
unpacking sums things as two ways to end up with the 3 variant tag and 
no extra indirection or partiality. So for those goals, it's, adopt 
GADTs or modify GHC, pick your poison! :)

John

On 4/25/19 2:04 PM, David Feuer wrote:
> Oh, I see what you mean. But then containers would depend on GADTs, 
> which is rather counter to tradition. And I don't really see the 
> benefit over a newtype and smart constructors around the existing Seq 
> type.
>
> On Thu, Apr 25, 2019, 1:41 PM Zemyla <zemyla at gmail.com 
> <mailto:zemyla at gmail.com>> wrote:
>
>     We could have a GADT for the FingerTree:
>
>     data Emptiness = E | NE
>
>     data FingerTree e a where
>       EmptyT :: FingerTree 'E a
>       SingleT :: a -> FingerTree e a
>       DeepT :: !(Digit a) -> FingerTree e' (Node a) -> !(Digit a) ->
>     FingerTree e a
>
>     And because Seq is a newtype wrapper around FingerTree (Elem a)
>     anyway, Seq and NonEmptySeq would become newtype wrappers around
>     FingerTree 'E (Elem a) and FingerTree 'NE (Elem a). This avoids
>     the majority of the code duplication.
>
>     On Thu, Apr 25, 2019, 12:34 John Ericson
>     <john.ericson at obsidian.systems> wrote:
>
>         Thanks for the clarification, David. Sounds like a good
>         wait-and-see thing: after this lands, Zemyla's proposal and
>         the experience with NonEmpty{Map,Set} can inform what Seq
>         should end up looking like.
>
>         John
>
>         On 4/25/19 1:28 PM, David Feuer wrote:
>>         No, a Seq can be Empty at the bottom too. It's definitely not
>>         a mutually recursive situation like Set or Map, and it's not
>>         immediately a top-level one like IntSet or IntMap. A nonempty
>>         sequence type would need somewhat different functions for
>>         everything. In most cases, the changes seem fairly
>>         straightforward. I'm not sure about weird functions like
>>         inits and tails. Some extra care might be required for
>>         replicate.
>>
>>         I believe it would be possible to restructure things to make
>>         the distinction a top-level one, by using a possibly-empty
>>         type for the top and a nonempty one below. Zemyla has already
>>         identified some very solid unrelated reasons to want to
>>         separate the tops from the rest, but I'm somewhat concerned
>>         about source code duplication with that general approach.
>>         Even if we do that, it's not clear to me that we can make a
>>         non-empty/possibly-empty distinction without incurring a
>>         performance penalty.
>>
>>         On Thu, Apr 25, 2019, 1:17 PM John Ericson
>>         <john.ericson at obsidian.systems>
>>         <mailto:john.ericson at obsidian.systems> wrote:
>>
>>             I haven't looked into `Seq` in addition to `Map` and
>>             `Set`, just
>>             `IntSet` and `IntMap`. But it might be a similar thing? I
>>             take it that
>>             with `Seq` today only the root can be empty and
>>             everything else is
>>             single or deep? That means we *would* just use a single
>>             Maybe-like thing
>>             at the top level, no mutual recursion. But on the other
>>             hand to make
>>             that to make that work efficiently we would would need
>>             GHC to support
>>             unboxing sums, so 1 + 2 variants can become a flat 3.
>>
>>             Also, https://github.com/haskell/containers/pull/616 is
>>             now where the
>>             actual implementation is happening, not just the datatype
>>             changes as
>>             before. Feel free to comment on the concrete work in
>>             progress, everyone!
>>
>>             John
>>
>>             On 4/25/19 11:36 AM, Zemyla wrote:
>>             > A Seq has either Empty, Single, or Deep. A NonEmptySeq
>>             would have just
>>             > Single or Deep.
>>             >
>>             > On Thu, Apr 25, 2019, 09:55 David Feuer
>>             <david.feuer at gmail.com <mailto:david.feuer at gmail.com>
>>             > <mailto:david.feuer at gmail.com
>>             <mailto:david.feuer at gmail.com>>> wrote:
>>             >
>>             >     I don't see the benefit there, unless you see a way
>>             to work it
>>             >     into the representation.
>>             >
>>             >     On Thu, Apr 25, 2019, 10:53 AM Zemyla
>>             <zemyla at gmail.com <mailto:zemyla at gmail.com>
>>             >     <mailto:zemyla at gmail.com
>>             <mailto:zemyla at gmail.com>>> wrote:
>>             >
>>             >         As long as we're doing this, can we also add
>>             NonEmptySeq as well?
>>             >
>>             >         On Thu, Apr 25, 2019, 09:11 Artyom Kazak
>>             <yom at artyom.me <mailto:yom at artyom.me>
>>             >         <mailto:yom at artyom.me <mailto:yom at artyom.me>>>
>>             wrote:
>>             >
>>             >             I'm -1 on any kind of |Map = NEMap|.
>>             >
>>             >             An ordinary map and a non-empty map are
>>             semantically
>>             >             different. I believe that if I non-empty
>>             maps were already
>>             >             in |containers|, I would pretty much always
>>             care whether a
>>             >             |Map| I see in code is a 0-map or 1-map.
>>             >
>>             >             Similarly, I prefer |Int| and |Word|
>>             instead of |Int| and
>>             >             |Unsigned.Int|. (Luckily that's already the
>>             case.)
>>             >
>>             >             We already have a precedent with |Text| and
>>             |ByteString|,
>>             >             where the lazy and the strict versions are only
>>             >             distinguished by the module prefix. In my
>>             experience,
>>             >             modules where both are used are pretty
>>             common, and I end
>>             >             up just introducing |type LByteString =
>>             Lazy.ByteString|
>>             >             in all my projects, because otherwise I
>>             need to scroll to
>>             >             the imports section whenever I need to know
>>             which flavor
>>             >             of bytestring is being used. (Or if I'm
>>             reading haddocks,
>>             >             I have to look at the link because Haddock
>>             hides module
>>             >             prefixes.)
>>             >
>>             >             "why not both" is even worse. I still can't
>>             trust the
>>             >             |Map|, but now I also have to learn and
>>             remember that two
>>             >             modules are the same. Speaking from
>>             experience again –
>>             >             most people seem to be surprised by the
>>             fact that
>>             >             |Data.Map.Lazy| and |Data.Map.Strict|
>>             export the same
>>             >             |Map| type. The proposed module hierarchy
>>             would move
>>             >             |containers| to the top of my "packages
>>             that confuse
>>             >             beginners" list, beating even |aeson|.
>>             >
>>             >             As an aside, I wish we had a proper
>>             interface for
>>             >             container-like structures, or at least a
>>             solution to name
>>             >             scoping. I really like the way Rust does
>>             it, for instance,
>>             >             where certain functions can be "attached"
>>             to a type – I'm
>>             >             hesitant to call them "methods" because
>>             Rust is not an OOP
>>             >             language.
>>             >             On Apr 25 2019, at 2:49 pm, Mario Blažević
>>             >             <mblazevic at stilo.com
>>             <mailto:mblazevic at stilo.com> <mailto:mblazevic at stilo.com
>>             <mailto:mblazevic at stilo.com>>> wrote:
>>             >
>>             >                 On 2019-04-18 11:00 p.m., David Feuer
>>             wrote:
>>             >
>>             >                     I'm in favor of the proposal. I
>>             find the
>>             >                     isomorphism between Map (a,b) v
>>             >                     and Map a (NonemptyMap b v) very
>>             pleasant. The
>>             >                     fact that others have
>>             >                     written less-performant
>>             implementations of this
>>             >                     idea is rather
>>             >                     convincing. The fact that doing
>>             this removes
>>             >                     partial matches in the
>>             >                     implementation is nice. And I'll
>>             take performance
>>             >                     improvements where I
>>             >                     can get them. The main question is
>>             the proper name
>>             >                     of the type. Just
>>             >  Data.Map.Nonempty.Map, or .NonemptyMap? Should the
>>             >                     empty be capitalized?
>>             >
>>             >
>>             >                 There seems to be a consensus for
>>             >                 Data.Map.NonEmpty.NEMap, with the
>>             >                 type and the functions slightly off the
>>             regular ones.
>>             >                 This design would
>>             >                 make it easier to use regular and
>>             non-empty containers
>>             >                 together, but it
>>             >                 be annoying for the use case of
>>             replacing all uses of
>>             >                 an existing
>>             >                 regular container with a non-empty one.
>>             I'd rather
>>             >                 change just the
>>             >                 import declaration than all occurrences
>>             of the type
>>             >                 name and functions.
>>             >
>>             >                 I don't want to derail the
>>             implementation with
>>             >                 bikeshedding, so I'm
>>             >                 just going to ask why not both? The
>>             library can both
>>             >                 export the tweaked
>>             >                 names and add a module, say
>>             Data.NonEmpty.Map.Lazy,
>>             >                 that exports the
>>             >                 type synonym Map = NEMap. It would also
>>             rename all the
>>             >                 functions back to
>>             >                 their names from Data.Map.Lazy.
>>             >
>>             >
>>             >
>>             >                     On Thu, Apr 18, 2019, 7:15 PM John
>>             Cotton Ericson
>>             > <John.Ericson at obsidian.systems>
>>             <mailto:John.Ericson at obsidian.systems> wrote:
>>             >
>>             >                     In
>>             > https://github.com/haskell/containers/issues/608 I
>>             >                     proposed
>>             >                     adding non-empty variants of Map
>>             and Set, analogous to
>>             >                     Data.List.NonEmpty for List, to
>>             containers.
>>             >                     semigroupoids
>>             >                     demonstrates the many uses and
>>             structure of
>>             >                     non-empty containers in
>>             >                     general, and libraries such as
>>             > https://github.com/mstksg/nonempty-containers and
>>             > https://github.com/andrewthad/non-empty-containers
>>             >                     demonstrate the
>>             >                     interest in non-empty maps and sets
>>             in particular.
>>             >                     My favorite
>>             >                     use-case is that they're needed to
>>             "curry"
>>             >                     containers: for example,
>>             >                     |Map (k0, k1) v| is isomorphic not
>>             to |Map k0 (Map
>>             >                     k1 v)| but to
>>             >                     |Map k0 (NonEmptyMap k1 v)|. I like
>>             this use-case
>>             >                     because it comes
>>             >                     from the containers themselves.
>>             >
>>             >                     Importantly, there's no good way to
>>             do this
>>             >                     outside of containers;
>>             >                     doing so leads to imbalancing /
>>             extra indirection,
>>             >                     or massive code
>>             >                     duplication. If one wraps the
>>             container was an
>>             >                     extra value like
>>             >                     Data.List.NonEmpty, one's left with
>>             an unavoidable
>>             >                     extra
>>             >  indirection/imbalance. One can rectify this by
>>             >                     copying and modifying
>>             >                     the implementation of containers,
>>             but that's
>>             >                     hardly maintainable;
>>             >                     even as though the algorithms are
>>             the same, enough
>>             >                     lines are touched
>>             >                     that merging upstream containers is
>>             nigh impossible.
>>             >
>>             >                     On the other hand, the non-empty
>>             containers can be
>>             >                     elegantly and
>>             >                     sufficiently implemented alongside
>>             their originals
>>             >                     by taking the Bin
>>             >                     constructor and breaking it out
>>             into it's own
>>             >                     type, mutually
>>             >                     recursive with the original. This
>>             avoids the
>>             >  indirection/imbalancing
>>             >                     and code duplication problems: the
>>             algorithms work
>>             >                     exactly as before
>>             >                     creating the same trees (remember
>>             the UNPACK), and
>>             >                     no code
>>             >                     duplicated since the functions
>>             become mutually
>>             >                     recursive matching
>>             >                     the types.
>>             >
>>             >                     To briefly summarize the thread:
>>             >
>>             >                     1. I proposed the issue after
>>             performing this same
>>             >                     refactor on the
>>             >                     dependent-map package:
>>             >
>>             https://github.com/obsidiansystems/dependent-map/tree/non-empty,
>>             >                     a fork of containers.
>>             >                     2. I made
>>             > https://github.com/haskell/containers/pull/616
>>             >                     which just
>>             >                     changes the types, to make sure
>>             UNPACK preserved
>>             >                     the importance.
>>             >                     3.
>>             >
>>             https://gist.github.com/Ericson2314/58709d0d99e0c0e83ad266701cd71841
>>             >                     the benchmarks showed rather than
>>             degrading
>>             >                     performance, PR 616
>>             >                     actually /improved/ it.
>>             >
>>             >                      If there is preliminary consensus,
>>             I'll make a
>>             >                     second PR on top
>>             >                     which generalizes the functions
>>             like on my
>>             >                     dependent-map branch.
>>             >
>>             >                     Thanks,
>>             >
>>             >                     John
>>             >
>>             >  _______________________________________________
>>             >                     Libraries mailing list
>>             > Libraries at haskell.org <mailto:Libraries at haskell.org>
>>             >                     <mailto:Libraries at haskell.org
>>             <mailto:Libraries at haskell.org>>
>>             >                     <mailto:Libraries at haskell.org
>>             <mailto:Libraries at haskell.org>
>>             >                     <mailto:Libraries at haskell.org
>>             <mailto:Libraries at haskell.org>>>
>>             > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>             >
>>             >
>>             >  _______________________________________________
>>             >                     Libraries mailing list
>>             > Libraries at haskell.org <mailto:Libraries at haskell.org>
>>             <mailto:Libraries at haskell.org <mailto:Libraries at haskell.org>>
>>             > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>             >
>>             >
>>             >  _______________________________________________
>>             >                 Libraries mailing list
>>             > Libraries at haskell.org <mailto:Libraries at haskell.org>
>>             <mailto:Libraries at haskell.org <mailto:Libraries at haskell.org>>
>>             > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>             >
>>             >  _______________________________________________
>>             >             Libraries mailing list
>>             > Libraries at haskell.org <mailto:Libraries at haskell.org>
>>             <mailto:Libraries at haskell.org <mailto:Libraries at haskell.org>>
>>             > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>             >
>>             >  _______________________________________________
>>             >         Libraries mailing list
>>             > Libraries at haskell.org <mailto:Libraries at haskell.org>
>>             <mailto:Libraries at haskell.org <mailto:Libraries at haskell.org>>
>>             > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>             >
>>             >
>>             > _______________________________________________
>>             > Libraries mailing list
>>             > Libraries at haskell.org <mailto:Libraries at haskell.org>
>>             > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20190425/10012682/attachment-0001.html>


More information about the Libraries mailing list