Proposal: Partitionable goes somewhere + containers instances

Edward Kmett ekmett
Mon Oct 7 14:29:47 UTC 2013


A rule of thumb that has served me well w.r.t. exposing internal modules is
to expose a Data.Foo.Internal but make it clear it is a very fragile
interface.

I even go so far as to say that this module does not follow the PVP and that
users should expect breaking changes to come fast and often. Users can then
only safely depend on it with minor-version-specific bounds. This
ameliorates the concerns about how it ties your hands as an implementor.

Breaking this API shouldn't require discussion on the mailing lists, as it
is an internal implementation detail. This should further ameliorate
concerns about it tying your hands as an implementor.

This lets users who need to write performant code avoid forking the
entire package. (I've had to do this before with Map, Text, and other
packages that have been rather hidebound about not exposing implementation
details. It sucks.)
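
To make the idea concrete, here is a minimal single-file sketch. Everything in
it is hypothetical: Foo stands in for any container type, the first half plays
the role of what a Data.Foo.Internal module would export (the type together
with its constructors), and the second half is the kind of code a downstream
library could only write with that access. It is an illustration under those
assumptions, not the containers implementation.

```haskell
-- Hypothetical stand-in for the exports of a Data.Foo.Internal module.
-- A downstream package would pin such a module tightly, e.g. with a
-- (hypothetical) bound like: build-depends: foo == 1.2.3.*
data Foo a
  = Tip
  | Bin !Int a (Foo a) (Foo a) -- cached size, element, left, right

size :: Foo a -> Int
size Tip = 0
size (Bin n _ _ _) = n

-- Smart constructor that keeps the cached size correct.
bin :: a -> Foo a -> Foo a -> Foo a
bin x l r = Bin (size l + size r + 1) x l r

-- Plain unbalanced BST insert; enough for the sketch.
insert :: Ord a => a -> Foo a -> Foo a
insert x Tip = bin x Tip Tip
insert x t@(Bin _ y l r) = case compare x y of
  LT -> bin y (insert x l) r
  GT -> bin y l (insert x r)
  EQ -> t

fromList :: Ord a => [a] -> Foo a
fromList = foldr insert Tip

-- Downstream code: an O(1) split at the root. This is only expressible
-- because the constructors are visible to the caller.
splitAtRoot :: Foo a -> Maybe (Foo a, a, Foo a)
splitAtRoot Tip = Nothing
splitAtRoot (Bin _ x l r) = Just (l, x, r)

-- A divide-and-conquer traversal built on the split; a parallel library
-- could evaluate the two recursive calls independently.
sumTree :: Num a => Foo a -> a
sumTree t = case splitAtRoot t of
  Nothing -> 0
  Just (l, x, r) -> sumTree l + x + sumTree r
```

If the representation changed (say, to three constructors), only splitAtRoot
would need updating in this sketch, which is the sort of chasing I mean.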

My experience is that maybe 1-2% of your users need it, but when they need it,
it is the difference between the package being usable or having to be
completely replaced with something else. They are also the kind of users
who understand the need for the best possible implementation and who will
roll with the punches.

Chasing after changes in the implementation is generally far less work than
maintaining an entire fork.

-Edward



On Mon, Oct 7, 2013 at 4:10 AM, Milan Straka <fox at ucw.cz> wrote:

> Hi all,
>
> > -----Original message-----
> > From: Ryan Newton <rrnewton at gmail.com>
> > Sent: 7 Oct 2013, 00:16
> >
> > Ok, so we've narrowed the focus quite a bit to JUST exposing enough from
> > containers to enable a third-party library to do all the parallel
> > traversals it wants.  Which of the following limited proposals would people
> > like more?
> >
> > (1) Expose Bin/Tip from, say, Data.Map.Internal, as in this patch:
> >
> > https://github.com/rrnewton/containers/commit/5d6b07f69e8396023101039a4aaab619af41c810
> >
> > (2) a splitTree function [1].  A patch can be found here:
> >
> > https://github.com/rrnewton/containers/commit/6153896f0c7e6cdf70656dc6b641ce61711175f8
> >
> > The argument for (1) would be that it doesn't pollute any namespaces people
> > actually use at all, and Tip & Bin would seem to be pretty darn stable at
> > this point.  The only consumers of this information in practice would be
> > downstream companion libraries (like, say, a parallel traversals library
> > for monad-par & LVish!)  Those could be updated if there were ever a
> > seismic shift in the containers implementations.
>
> I am strongly against (1). Exposing the internal representation seems really
> wrong. FYI, I am planning to change the representation of Data.Map and
> Data.Set to a three-constructor representation (I already have some
> benchmarks, halving the time taken by fold and decreasing memory usage
> by ~20%). So no, the internal representation is subject to change and
> I do not want it to become part of the API.
>
>
> As for (2), I am not very happy about the type -- returning _three_ maps
> makes some assumptions about the internal representation. This can be
> seen when considering IntMap.splitTree -- there are no three IntMaps to
> return in an IntMap.splitTree, only two.
>
> What about the list version
>   splitTree :: Map k a -> [Map k a]
> If splitTree is INLINE, I think we can assume the deforestation will
> happen. That would allow us to define splitTree for IntMap too.
> If someone is worried, could they check that deforestation really does
> happen?
>
> Cheers,
> Milan
>
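
For reference, the list-returning shape suggested in the quoted message can be
approximated today using only the public Data.Map API. The sketch below is an
assumption-laden stand-in, not the proposed function: the name splitTreeSketch
is made up, it needs an Ord constraint, and it costs O(log n) per split because
it cannot look at the root directly, whereas the proposed splitTree would
inspect the internal structure.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Hypothetical approximation of `splitTree :: Map k a -> [Map k a]`.
-- Splits around the median key, so the pieces are roughly balanced.
-- Callers see only a list of maps, never the internal constructors.
splitTreeSketch :: Ord k => Map k a -> [Map k a]
splitTreeSketch m
  | Map.null m = []
  | otherwise =
      let (k, v) = Map.elemAt (Map.size m `div` 2) m
          (l, r) = Map.split k m -- keys below k, keys above k
      in [l, Map.singleton k v, r]

-- A divide-and-conquer consumer: it works for any number of pieces, which
-- is what would let the same interface cover IntMap (two pieces) as well.
sumValues :: Ord k => Map k Int -> Int
sumValues m
  | Map.size m <= 1 = sum (Map.elems m)
  | otherwise = sum (map sumValues (splitTreeSketch m))
```

Because the consumer only maps over a list, it makes no assumption about
whether a split yields two maps or three.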