ArrayArrays

Tue Sep 8 08:29:36 UTC 2015

Once you start to include all the other primitive types there is a bit more
of an explosion. MVar#, TVar#, MutVar#, Small variants, etc. can all be
modified to carry unlifted content.

Being able to be parametric over that choice would permit a number of
things in user land to do the same thing with an open-ended set of design
possibilities that are rather hard to contemplate in advance. e.g. being
able to abstract over them could let you just use a normal (,) to carry
around unlifted parametric data types or being able to talk about [MVar# s
a] drastically reducing the number of one off data types we need to invent.

If you can talk about the machinery mentioned above then you can have
typeclasses parameterized on an argument that could be either unlifted or
lifted.

I'm not willing to fight too hard for it, but it feels more like the
"right" solution than retaining a cut-and-paste copy of the same code and
bifurcating further on each argument you want to consider such a degree of
freedom.

As such it seems like a pretty big win for a comparatively minor change to
the levity polymorphism machinery.

-Edward

On Tue, Sep 8, 2015 at 3:40 AM, Simon Marlow <marlowsd at gmail.com> wrote:

> This would be very cool, however it's questionable whether it's worth it.
>
> Without any unlifted kind, we need
>  - ArrayArray#
>  - a set of new/read/write primops for every element type,
>    either built-in or made from unsafeCoerce#
>
> With the unlifted kind, we would need
>  - ArrayArray#
>  - one set of new/read/write primops
>
> With levity polymorphism, we would need
>  - none of this, Array# can be used
>
> So having an unlifted kind already kills a lot of the duplication,
> polymorphism only kills a bit more.
>
> Cheers
> Simon
>
> On 08/09/2015 00:14, Edward Kmett wrote:
>
>> Assume we had the ability to talk about Levity in a new way and instead
>> of just:
>>
>> data Levity = Lifted | Unlifted
>>
>> type * = TYPE 'Lifted
>> type # = TYPE 'Unlifted
>>
>> we replace had a more nuanced notion of TYPE parameterized on another
>> data type:
>>
>> data Levity = Lifted | Unlifted
>> data Param = Composite | Simple Levity
>>
>> and we parameterized TYPE with a Param rather than Levity.
>>
>> Existing strange representations can continue to live in TYPE 'Composite
>>
>> (# Int# , Double #) :: TYPE 'Composite
>>
>> and we don't support parametricity in there, just like, currently we
>> don't allow parametricity in #.
>>
>> We can include the undefined example from Richard's talk:
>>
>> undefined :: forall (v :: Param). v
>>
>> and ultimately lift it into his pi type when it is available just as
>> before.
>>
>> But we could let consider TYPE ('Simple 'Unlifted) as a form of
>> 'parametric #' covering unlifted things we're willing to allow
>> polymorphism over because they are just pointers to something in the
>> heap, that just happens to not be able to be _|_ or a thunk.
>>
>> In this setting, recalling that above, I modified Richard's TYPE to take
>> a Param instead of Levity, we can define a type alias for things that
>> live as a simple pointer to a heap allocated object:
>>
>> type GC (l :: Levity) = TYPE ('Simple l)
>> type * = GC 'Lifted
>>
>> and then we can look at existing primitives generalized:
>>
>> Array# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted
>> MutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted
>> SmallArray# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted
>> SmallMutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC
>> 'Unlifted
>> MutVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted
>> MVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted
>>
>> Weak#, StablePtr#, StableName#, etc. all can take similar modifications.
>>
>> Recall that an ArrayArray# was just an Array# hacked up to be able to
>> hold onto the subset of # that is collectable.
>>
>> Almost all of the operations on these data types can work on the more
>> general kind of argument.
>>
>> newArray# :: forall (s :: *) (l :: Levity) (a :: GC l). Int# -> a ->
>> State# s -> (# State# s, MutableArray# s a #)
>>
>> writeArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray#
>> s a -> Int# -> a -> State# s -> State# s
>>
>> readArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# s
>> a -> Int# -> State# s -> (# State# s, a #)
>>
>> etc.
>>
>> Only a couple of our existing primitives _can't_ generalize this way.
>> The one that leaps to mind is atomicModifyMutVar, which would need to
>> stay constrained to only work on arguments in *, because of the way it
>> operates.
>>
>> With that we can still talk about
>>
>> MutableArray# s Int
>>
>> but now we can also talk about:
>>
>> MutableArray# s (MutableArray# s Int)
>>
>> without the layer of indirection through a box in * and without an
>> explosion of primops. The same newFoo, readFoo, writeFoo machinery works
>> for both kinds.
>>
>> The struct machinery doesn't get to take advantage of this, but it would
>> let us clean house elsewhere in Prim and drastically improve the range
>> of applicability of the existing primitives with nothing more than a
>> small change to the levity machinery.
>>
>> I'm not attached to any of the names above, I coined them just to give
>> us a concrete thing to talk about.
>>
>> Here I'm only proposing we extend machinery in GHC.Prim this way, but an
>> interesting 'now that the barn door is open' question is to consider
>> that our existing Haskell data types often admit a similar form of
>> parametricity and nothing in principle prevents this from working for
>> Maybe or [] and once you permit inference to fire across all of GC l
>> then it seems to me that you'd start to get those same capabilities
>> there as well when LevityPolymorphism was turned on.
>>
>> -Edward
>>
>> On Mon, Sep 7, 2015 at 5:56 PM, Simon Peyton Jones
>> <simonpj at microsoft.com <mailto:simonpj at microsoft.com>> wrote:
>>
>>     This could make the menagerie of ways to pack
>>     {Small}{Mutable}Array{Array}# references into a
>>     {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing
>>     the need for folks to descend into the use of the more evil
>>     structure primitives we're talking about, and letting us keep a few
>>     more principles around us.____
>>
>>     __ __
>>
>>     I’m lost. Can you give some concrete examples that illustrate how
>>     levity polymorphism will help us?____
>>
>>
>>     Simon____
>>
>>     __ __
>>
>>     *From:*Edward Kmett [mailto:ekmett at gmail.com <mailto:ekmett at gmail.com
>> >]
>>     *Sent:* 07 September 2015 21:17
>>     *To:* Simon Peyton Jones
>>     *Cc:* Ryan Newton; Johan Tibell; Simon Marlow; Manuel M T
>>     Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates
>>     *Subject:* Re: ArrayArrays____
>>
>>     __ __
>>
>>     I had a brief discussion with Richard during the Haskell Symposium
>>     about how we might be able to let parametricity help a bit in
>>     reducing the space of necessarily primops to a slightly more
>>     manageable level. ____
>>
>>     __ __
>>
>>     Notably, it'd be interesting to explore the ability to allow
>>     parametricity over the portion of # that is just a gcptr.____
>>
>>     __ __
>>
>>     We could do this if the levity polymorphism machinery was tweaked a
>>     bit. You could envision the ability to abstract over things in both
>>     * and the subset of # that are represented by a gcptr, then
>>     modifying the existing array primitives to be parametric in that
>>     choice of levity for their argument so long as it was of a "heap
>>     object" levity.____
>>
>>     __ __
>>
>>     This could make the menagerie of ways to pack
>>     {Small}{Mutable}Array{Array}# references into a
>>     {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing
>>     the need for folks to descend into the use of the more evil
>>     structure primitives we're talking about, and letting us keep a few
>>     more principles around us.____
>>
>>     __ __
>>
>>     Then in the cases like `atomicModifyMutVar#` where it needs to
>>     actually be in * rather than just a gcptr, due to the constructed
>>     field selectors it introduces on the heap then we could keep the
>>     existing less polymorphic type.____
>>
>>     __ __
>>
>>     -Edward____
>>
>>     __ __
>>
>>     On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones
>>     <simonpj at microsoft.com <mailto:simonpj at microsoft.com>> wrote:____
>>
>>         It was fun to meet and discuss this.____
>>
>>         ____
>>
>>         Did someone volunteer to write a wiki page that describes the
>>         proposed design?  And, I earnestly hope, also describes the
>>         menagerie of currently available array types and primops so that
>>         users can have some chance of picking the right one?!____
>>
>>         ____
>>
>>         Thanks____
>>
>>         ____
>>
>>         Simon____
>>
>>         ____
>>
>>         *From:*ghc-devs [mailto:ghc-devs-bounces at haskell.org
>>         <mailto:ghc-devs-bounces at haskell.org>] *On Behalf Of *Ryan Newton
>>         *Sent:* 31 August 2015 23:11
>>         *To:* Edward Kmett; Johan Tibell
>>         *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen;
>>         ghc-devs; Ryan Scott; Ryan Yates
>>         *Subject:* Re: ArrayArrays____
>>
>>         ____
>>
>>         Dear Edward, Ryan Yates, and other interested parties -- ____
>>
>>         ____
>>
>>         So when should we meet up about this?____
>>
>>         ____
>>
>>         May I propose the Tues afternoon break for everyone at ICFP who
>>         is interested in this topic?  We can meet out in the coffee area
>>         and congregate around Edward Kmett, who is tall and should be
>>         easy to find ;-).____
>>
>>         ____
>>
>>         I think Ryan is going to show us how to use his new primops for
>>         combined array + other fields in one heap object?____
>>
>>         ____
>>
>>         On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett <ekmett at gmail.com
>>         <mailto:ekmett at gmail.com>> wrote:____
>>
>>             Without a custom primitive it doesn't help much there, you
>>             have to store the indirection to the mask.____
>>
>>             ____
>>
>>             With a custom primitive it should cut the on heap
>>             root-to-leaf path of everything in the HAMT in half. A
>>             shorter HashMap was actually one of the motivating factors
>>             for me doing this. It is rather astoundingly difficult to
>>             beat the performance of HashMap, so I had to start cheating
>>             pretty badly. ;)____
>>
>>             ____
>>
>>             -Edward____
>>
>>             ____
>>
>>             On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell
>>             <johan.tibell at gmail.com <mailto:johan.tibell at gmail.com>>
>>             wrote:____
>>
>>                 I'd also be interested to chat at ICFP to see if I can
>>                 use this for my HAMT implementation.____
>>
>>                 ____
>>
>>                 On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett
>>                 <ekmett at gmail.com <mailto:ekmett at gmail.com>> wrote:____
>>
>>                     Sounds good to me. Right now I'm just hacking up
>>                     composable accessors for "typed slots" in a fairly
>>                     lens-like fashion, and treating the set of slots I
>>                     define and the 'new' function I build for the data
>>                     type as its API, and build atop that. This could
>>                     eventually graduate to template-haskell, but I'm not
>>                     entirely satisfied with the solution I have. I
>>                     currently distinguish between what I'm calling
>>                     "slots" (things that point directly to another
>>                     SmallMutableArrayArray# sans wrapper) and "fields"
>>                     which point directly to the usual Haskell data types
>>                     because unifying the two notions meant that I
>>                     couldn't lift some coercions out "far enough" to
>>                     make them vanish.____
>>
>>                     ____
>>
>>                     I'll be happy to run through my current working set
>>                     of issues in person and -- as things get nailed down
>>                     further -- in a longer lived medium than in personal
>>                     conversations. ;)____
>>
>>                     ____
>>
>>                     -Edward____
>>
>>                     ____
>>
>>                     On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton
>>                     <rrnewton at gmail.com <mailto:rrnewton at gmail.com>>
>>                     wrote:____
>>
>>                         I'd also love to meet up at ICFP and discuss
>>                         this.  I think the array primops plus a TH layer
>>                         that lets (ab)use them many times without too
>>                         much marginal cost sounds great.  And I'd like
>>                         to learn how we could be either early users of,
>>                         or help with, this infrastructure.____
>>
>>                         ____
>>
>>                         CC'ing in Ryan Scot and Omer Agacan who may also
>>                         be interested in dropping in on such discussions
>>                         @ICFP, and Chao-Hong Chen, a Ph.D. student who
>>                         is currently working on concurrent data
>>                         structures in Haskell, but will not be at
>> ICFP.____
>>
>>                         ____
>>
>>                         ____
>>
>>                         On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates
>>                         <fryguybob at gmail.com
>>                         <mailto:fryguybob at gmail.com>> wrote:____
>>
>>                             I completely agree.  I would love to spend
>>                             some time during ICFP and
>>                             friends talking about what it could look
>>                             like.  My small array for STM
>>                             changes for the RTS can be seen here [1].
>>                             It is on a branch somewhere
>>                             between 7.8 and 7.10 and includes irrelevant
>>                             STM bits and some
>>                             confusing naming choices (sorry), but should
>>                             cover all the details
>>                             needed to implement it for a non-STM
>>                             context.  The biggest surprise
>>                             for me was following small array too closely
>>                             and having a word/byte
>>                             offset miss-match [2].
>>
>>                             [1]:
>>
>> https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut
>>                             [2]:
>>                             https://ghc.haskell.org/trac/ghc/ticket/10413
>>
>>                             Ryan____
>>
>>
>>                             On Fri, Aug 28, 2015 at 10:09 PM, Edward
>>                             Kmett <ekmett at gmail.com
>>                             <mailto:ekmett at gmail.com>> wrote:
>>                              > I'd love to have that last 10%, but its a
>>                             lot of work to get there and more
>>                              > importantly I don't know quite what it
>>                             should look like.
>>                              >
>>                              > On the other hand, I do have a pretty
>>                             good idea of how the primitives above
>>                              > could be banged out and tested in a long
>>                             evening, well in time for 7.12. And
>>                              > as noted earlier, those remain useful
>>                             even if a nicer typed version with an
>>                              > extra level of indirection to the sizes
>>                             is built up after.
>>                              >
>>                              > The rest sounds like a good graduate
>>                             student project for someone who has
>>                              > graduate students lying around. Maybe
>>                             somebody at Indiana University who has
>>                              > an interest in type theory and
>>                             parallelism can find us one. =)
>>                              >
>>                              > -Edward
>>                              >
>>                              > On Fri, Aug 28, 2015 at 8:48 PM, Ryan
>>                             Yates <fryguybob at gmail.com
>>                             <mailto:fryguybob at gmail.com>> wrote:
>>                              >>
>>                              >> I think from my perspective, the
>>                             motivation for getting the type
>>                              >> checker involved is primarily bringing
>>                             this to the level where users
>>                              >> could be expected to build these
>>                             structures.  it is reasonable to
>>                              >> think that there are people who want to
>>                             use STM (a context with
>>                              >> mutation already) to implement a
>>                             straight forward data structure that
>>                              >> avoids extra indirection penalty.  There
>>                             should be some places where
>>                              >> knowing that things are field accesses
>>                             rather then array indexing
>>                              >> could be helpful, but I think GHC is
>>                             good right now about handling
>>                              >> constant offsets.  In my code I don't do
>>                             any bounds checking as I know
>>                              >> I will only be accessing my arrays with
>>                             constant indexes.  I make
>>                              >> wrappers for each field access and leave
>>                             all the unsafe stuff in
>>                              >> there.  When things go wrong though, the
>>                             compiler is no help.  Maybe
>>                              >> template Haskell that generates the
>>                             appropriate wrappers is the right
>>                              >> direction to go.
>>                              >> There is another benefit for me when
>>                             working with these as arrays in
>>                              >> that it is quite simple and direct
>>                             (given the hoops already jumped
>>                              >> through) to play with alignment.  I can
>>                             ensure two pointers are never
>>                              >> on the same cache-line by just spacing
>>                             things out in the array.
>>                              >>
>>                              >> On Fri, Aug 28, 2015 at 7:33 PM, Edward
>>                             Kmett <ekmett at gmail.com
>>                             <mailto:ekmett at gmail.com>> wrote:
>>                              >> > They just segfault at this level. ;)
>>                              >> >
>>                              >> > Sent from my iPhone
>>                              >> >
>>                              >> > On Aug 28, 2015, at 7:25 PM, Ryan
>>                             Newton <rrnewton at gmail.com
>>                             <mailto:rrnewton at gmail.com>> wrote:
>>                              >> >
>>                              >> > You presumably also save a bounds
>>                             check on reads by hard-coding the
>>                              >> > sizes?
>>                              >> >
>>                              >> > On Fri, Aug 28, 2015 at 3:39 PM,
>>                             Edward Kmett <ekmett at gmail.com
>>                             <mailto:ekmett at gmail.com>> wrote:
>>                              >> >>
>>                              >> >> Also there are 4 different "things"
>>                             here, basically depending on two
>>                              >> >> independent questions:
>>                              >> >>
>>                              >> >> a.) if you want to shove the sizes
>>                             into the info table, and
>>                              >> >> b.) if you want cardmarking.
>>                              >> >>
>>                              >> >> Versions with/without cardmarking for
>>                             different sizes can be done
>>                              >> >> pretty
>>                              >> >> easily, but as noted, the infotable
>>                             variants are pretty invasive.
>>                              >> >>
>>                              >> >> -Edward
>>                              >> >>
>>                              >> >> On Fri, Aug 28, 2015 at 6:36 PM,
>>                             Edward Kmett <ekmett at gmail.com
>>                             <mailto:ekmett at gmail.com>> wrote:
>>                              >> >>>
>>                              >> >>> Well, on the plus side you'd save 16
>>                             bytes per object, which adds up
>>                              >> >>> if
>>                              >> >>> they were small enough and there are
>>                             enough of them. You get a bit
>>                              >> >>> better
>>                              >> >>> locality of reference in terms of
>>                             what fits in the first cache line of
>>                              >> >>> them.
>>                              >> >>>
>>                              >> >>> -Edward
>>                              >> >>>
>>                              >> >>> On Fri, Aug 28, 2015 at 6:14 PM,
>>                             Ryan Newton <rrnewton at gmail.com
>>                             <mailto:rrnewton at gmail.com>>
>>                              >> >>> wrote:
>>                              >> >>>>
>>                              >> >>>> Yes. And for the short term I can
>>                             imagine places we will settle with
>>                              >> >>>> arrays even if it means tracking
>>                             lengths unnecessarily and
>>                              >> >>>> unsafeCoercing
>>                              >> >>>> pointers whose types don't actually
>>                             match their siblings.
>>                              >> >>>>
>>                              >> >>>> Is there anything to recommend the
>>                             hacks mentioned for fixed sized
>>                              >> >>>> array
>>                              >> >>>> objects *other* than using them to
>>                             fake structs? (Much to
>>                              >> >>>> derecommend, as
>>                              >> >>>> you mentioned!)
>>                              >> >>>>
>>                              >> >>>> On Fri, Aug 28, 2015 at 3:07 PM
>>                             Edward Kmett <ekmett at gmail.com
>>                             <mailto:ekmett at gmail.com>>
>>                              >> >>>> wrote:
>>                              >> >>>>>
>>                              >> >>>>> I think both are useful, but the
>>                             one you suggest requires a lot more
>>                              >> >>>>> plumbing and doesn't subsume all
>>                             of the usecases of the other.
>>                              >> >>>>>
>>                              >> >>>>> -Edward
>>                              >> >>>>>
>>                              >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM,
>>                             Ryan Newton <rrnewton at gmail.com
>>                             <mailto:rrnewton at gmail.com>>
>>                              >> >>>>> wrote:
>>                              >> >>>>>>
>>                              >> >>>>>> So that primitive is an array
>>                             like thing (Same pointed type,
>>                              >> >>>>>> unbounded
>>                              >> >>>>>> length) with extra payload.
>>                              >> >>>>>>
>>                              >> >>>>>> I can see how we can do without
>>                             structs if we have arrays,
>>                              >> >>>>>> especially
>>                              >> >>>>>> with the extra payload at front.
>>                             But wouldn't the general solution
>>                              >> >>>>>> for
>>                              >> >>>>>> structs be one that that allows
>>                             new user data type defs for #
>>                              >> >>>>>> types?
>>                              >> >>>>>>
>>                              >> >>>>>>
>>                              >> >>>>>>
>>                              >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM
>>                             Edward Kmett <ekmett at gmail.com
>>                             <mailto:ekmett at gmail.com>>
>>                              >> >>>>>> wrote:
>>                              >> >>>>>>>
>>                              >> >>>>>>> Some form of MutableStruct# with
>>                             a known number of words and a
>>                              >> >>>>>>> known
>>                              >> >>>>>>> number of pointers is basically
>>                             what Ryan Yates was suggesting
>>                              >> >>>>>>> above, but
>>                              >> >>>>>>> where the word counts were
>>                             stored in the objects themselves.
>>                              >> >>>>>>>
>>                              >> >>>>>>> Given that it'd have a couple of
>>                             words for those counts it'd
>>                              >> >>>>>>> likely
>>                              >> >>>>>>> want to be something we build in
>>                             addition to MutVar# rather than a
>>                              >> >>>>>>> replacement.
>>                              >> >>>>>>>
>>                              >> >>>>>>> On the other hand, if we had to
>>                             fix those numbers and build info
>>                              >> >>>>>>> tables that knew them, and
>>                             typechecker support, for instance, it'd
>>                              >> >>>>>>> get
>>                              >> >>>>>>> rather invasive.
>>                              >> >>>>>>>
>>                              >> >>>>>>> Also, a number of things that we
>>                             can do with the 'sized' versions
>>                              >> >>>>>>> above, like working with evil
>>                             unsized c-style arrays directly
>>                              >> >>>>>>> inline at the
>>                              >> >>>>>>> end of the structure cease to be
>>                             possible, so it isn't even a pure
>>                              >> >>>>>>> win if we
>>                              >> >>>>>>> did the engineering effort.
>>                              >> >>>>>>>
>>                              >> >>>>>>> I think 90% of the needs I have
>>                             are covered just by adding the one
>>                              >> >>>>>>> primitive. The last 10% gets
>>                             pretty invasive.
>>                              >> >>>>>>>
>>                              >> >>>>>>> -Edward
>>                              >> >>>>>>>
>>                              >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM,
>>                             Ryan Newton <rrnewton at gmail.com
>>                             <mailto:rrnewton at gmail.com>>
>>                              >> >>>>>>> wrote:
>>                              >> >>>>>>>>
>>                              >> >>>>>>>> I like the possibility of a
>>                             general solution for mutable structs
>>                              >> >>>>>>>> (like Ed said), and I'm trying
>>                             to fully understand why it's hard.
>>                              >> >>>>>>>>
>>                              >> >>>>>>>> So, we can't unpack MutVar into
>>                             constructors because of object
>>                              >> >>>>>>>> identity problems. But what
>>                             about directly supporting an
>>                              >> >>>>>>>> extensible set of
>>                              >> >>>>>>>> unlifted MutStruct# objects,
>>                             generalizing (and even replacing)
>>                              >> >>>>>>>> MutVar#? That
>>                              >> >>>>>>>> may be too much work, but is it
>>                             problematic otherwise?
>>                              >> >>>>>>>>
>>                              >> >>>>>>>> Needless to say, this is also
>>                             critical if we ever want best in
>>                              >> >>>>>>>> class
>>                              >> >>>>>>>> lockfree mutable structures,
>>                             just like their Stm and sequential
>>                              >> >>>>>>>> counterparts.
>>                              >> >>>>>>>>
>>                              >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM
>>                             Simon Peyton Jones
>>                              >> >>>>>>>> <simonpj at microsoft.com
>>                             <mailto:simonpj at microsoft.com>> wrote:
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> At the very least I'll take
>>                             this email and turn it into a short
>>                              >> >>>>>>>>> article.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> Yes, please do make it into a
>>                             wiki page on the GHC Trac, and
>>                              >> >>>>>>>>> maybe
>>                              >> >>>>>>>>> make a ticket for it.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> Thanks
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> Simon
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> From: Edward Kmett
>>                             [mailto:ekmett at gmail.com
>>                             <mailto:ekmett at gmail.com>]
>>                              >> >>>>>>>>> Sent: 27 August 2015 16:54
>>                              >> >>>>>>>>> To: Simon Peyton Jones
>>                              >> >>>>>>>>> Cc: Manuel M T Chakravarty;
>>                             Simon Marlow; ghc-devs
>>                              >> >>>>>>>>> Subject: Re: ArrayArrays
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> An ArrayArray# is just an
>>                             Array# with a modified invariant. It
>>                              >> >>>>>>>>> points directly to other
>>                             unlifted ArrayArray#'s or ByteArray#'s.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> While those live in #, they
>>                             are garbage collected objects, so
>>                              >> >>>>>>>>> this
>>                              >> >>>>>>>>> all lives on the heap.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> They were added to make some
>>                             of the DPH stuff fast when it has
>>                              >> >>>>>>>>> to
>>                              >> >>>>>>>>> deal with nested arrays.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> I'm currently abusing them as
>>                             a placeholder for a better thing.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> The Problem
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> -----------------
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> Consider the scenario where
>>                             you write a classic doubly-linked
>>                              >> >>>>>>>>> list
>>                              >> >>>>>>>>> in Haskell.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> data DLL = DLL (IORef (Maybe
>>                             DLL) (IORef (Maybe DLL)
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> Chasing from one DLL to the
>>                             next requires following 3 pointers
>>                              >> >>>>>>>>> on
>>                              >> >>>>>>>>> the heap.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~>
>>                             MutVar# RealWorld (Maybe DLL) ~>
>>                              >> >>>>>>>>> Maybe
>>                              >> >>>>>>>>> DLL ~> DLL
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> That is 3 levels of indirection.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> We can trim one by simply
>>                             unpacking the IORef with
>>                              >> >>>>>>>>> -funbox-strict-fields or UNPACK
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> We can trim another by adding
>>                             a 'Nil' constructor for DLL and
>>                              >> >>>>>>>>> worsening our representation.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> data DLL = DLL !(IORef DLL)
>>                             !(IORef DLL) | Nil
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> but now we're still stuck with
>>                             a level of indirection
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL
>>                             ~> DLL
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> This means that every
>>                             operation we perform on this structure
>>                              >> >>>>>>>>> will
>>                              >> >>>>>>>>> be about half of the speed of
>>                             an implementation in most other
>>                              >> >>>>>>>>> languages
>>                              >> >>>>>>>>> assuming we're memory bound on
>>                             loading things into cache!
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> Making Progress
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> ----------------------
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> I have been working on a
>>                             number of data structures where the
>>                              >> >>>>>>>>> indirection of going from
>>                             something in * out to an object in #
>>                              >> >>>>>>>>> which
>>                              >> >>>>>>>>> contains the real pointer to
>>                             my target and coming back
>>                              >> >>>>>>>>> effectively doubles
>>                              >> >>>>>>>>> my runtime.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> We go out to the MutVar#
>>                             because we are allowed to put the
>>                              >> >>>>>>>>> MutVar#
>>                              >> >>>>>>>>> onto the mutable list when we
>>                             dirty it. There is a well defined
>>                              >> >>>>>>>>> write-barrier.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> I could change out the
>>                             representation to use
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> data DLL = DLL (MutableArray#
>>                             RealWorld DLL) | Nil
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> I can just store two pointers
>>                             in the MutableArray# every time,
>>                              >> >>>>>>>>> but
>>                              >> >>>>>>>>> this doesn't help _much_
>>                             directly. It has reduced the amount of
>>                              >> >>>>>>>>> distinct
>>                              >> >>>>>>>>> addresses in memory I touch on
>>                             a walk of the DLL from 3 per
>>                              >> >>>>>>>>> object to 2.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> I still have to go out to the
>>                             heap from my DLL and get to the
>>                              >> >>>>>>>>> array
>>                              >> >>>>>>>>> object and then chase it to
>>                             the next DLL and chase that to the
>>                              >> >>>>>>>>> next array. I
>>                              >> >>>>>>>>> do get my two pointers
>>                             together in memory though. I'm paying for
>>                              >> >>>>>>>>> a card
>>                              >> >>>>>>>>> marking table as well, which I
>>                             don't particularly need with just
>>                              >> >>>>>>>>> two
>>                              >> >>>>>>>>> pointers, but we can shed that
>>                             with the "SmallMutableArray#"
>>                              >> >>>>>>>>> machinery added
>>                              >> >>>>>>>>> back in 7.10, which is just
>>                             the old array code a a new data
>>                              >> >>>>>>>>> type, which can
>>                              >> >>>>>>>>> speed things up a bit when you
>>                             don't have very big arrays:
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> data DLL = DLL
>>                             (SmallMutableArray# RealWorld DLL) | Nil
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> But what if I wanted my object
>>                             itself to live in # and have two
>>                              >> >>>>>>>>> mutable fields and be able to
>>                             share the sme write barrier?
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> An ArrayArray# points directly
>>                             to other unlifted array types.
>>                              >> >>>>>>>>> What
>>                              >> >>>>>>>>> if we have one # -> * wrapper
>>                             on the outside to deal with the
>>                              >> >>>>>>>>> impedence
>>                              >> >>>>>>>>> mismatch between the
>>                             imperative world and Haskell, and then just
>>                              >> >>>>>>>>> let the
>>                              >> >>>>>>>>> ArrayArray#'s hold other
>>                             arrayarrays.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> data DLL = DLL
>>                             (MutableArrayArray# RealWorld)
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> now I need to make up a new
>>                             Nil, which I can just make be a
>>                              >> >>>>>>>>> special
>>                              >> >>>>>>>>> MutableArrayArray# I allocate
>>                             on program startup. I can even
>>                              >> >>>>>>>>> abuse pattern
>>                              >> >>>>>>>>> synonyms. Alternately I can
>>                             exploit the internals further to
>>                              >> >>>>>>>>> make this
>>                              >> >>>>>>>>> cheaper.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> Then I can use the
>>                             readMutableArrayArray# and
>>                              >> >>>>>>>>> writeMutableArrayArray# calls
>>                             to directly access the preceding
>>                              >> >>>>>>>>> and next
>>                              >> >>>>>>>>> entry in the linked list.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> So now we have one DLL wrapper
>>                             which just 'bootstraps me' into a
>>                              >> >>>>>>>>> strict world, and everything
>>                             there lives in #.
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> next :: DLL -> IO DLL
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> next (DLL m) = IO $ \s -> case
>>                             readMutableArrayArray# s of
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>    (# s', n #) -> (# s', DLL n
>> #)
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>>
>>                              >> >>>>>>>>> It turns out GHC is quite
>>                             happy to optimize all of that code to
>>                              >> >>>>>>>>> keep things unboxed. The 'DLL'
>>                             wrappers get removed pretty
>>                              >> >>>>>>>>> easily when they
>>                              >> >>>>>>>>> are known strict and you chain
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20150908/1678d6f2/attachment-0001.html>