ArrayArrays

Edward Kmett ekmett at gmail.com
Mon Sep 7 20:31:57 UTC 2015


Indeed. I can CAS today with appropriately coerced primitives.

-Edward

On Mon, Sep 7, 2015 at 4:27 PM, Ryan Newton <rrnewton at gmail.com> wrote:

> Ah, incidentally that introduces an interesting difference between
> atomicModify and CAS.  CAS should be able to work on mutable locations in
> that subset of # that are represented by a gcptr, whereas Edward pointed
> out that atomicModify cannot.
>
> (Indeed, to use lock-free algorithms with these new unboxed mutable
> structures we'll need CAS on the slots.)
>
> On Mon, Sep 7, 2015 at 4:16 PM, Edward Kmett <ekmett at gmail.com> wrote:
>
>> I had a brief discussion with Richard during the Haskell Symposium about
>> how we might be able to let parametricity help a bit in reducing the space
>> of necessarily primops to a slightly more manageable level.
>>
>> Notably, it'd be interesting to explore the ability to allow
>> parametricity over the portion of # that is just a gcptr.
>>
>> We could do this if the levity polymorphism machinery was tweaked a bit.
>> You could envision the ability to abstract over things in both * and the
>> subset of # that are represented by a gcptr, then modifying the existing
>> array primitives to be parametric in that choice of levity for their
>> argument so long as it was of a "heap object" levity.
>>
>> This could make the menagerie of ways to pack
>> {Small}{Mutable}Array{Array}# references into a
>> {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the
>> need for folks to descend into the use of the more evil structure
>> primitives we're talking about, and letting us keep a few more principles
>> around us.
>>
>> Then in the cases like `atomicModifyMutVar#` where it needs to actually
>> be in * rather than just a gcptr, due to the constructed field selectors it
>> introduces on the heap then we could keep the existing less polymorphic
>> type.
>>
>> -Edward
>>
>> On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones <simonpj at microsoft.com
>> > wrote:
>>
>>> It was fun to meet and discuss this.
>>>
>>>
>>>
>>> Did someone volunteer to write a wiki page that describes the proposed
>>> design?  And, I earnestly hope, also describes the menagerie of currently
>>> available array types and primops so that users can have some chance of
>>> picking the right one?!
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> Simon
>>>
>>>
>>>
>>> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of *Ryan
>>> Newton
>>> *Sent:* 31 August 2015 23:11
>>> *To:* Edward Kmett; Johan Tibell
>>> *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs;
>>> Ryan Scott; Ryan Yates
>>> *Subject:* Re: ArrayArrays
>>>
>>>
>>>
>>> Dear Edward, Ryan Yates, and other interested parties --
>>>
>>>
>>>
>>> So when should we meet up about this?
>>>
>>>
>>>
>>> May I propose the Tues afternoon break for everyone at ICFP who is
>>> interested in this topic?  We can meet out in the coffee area and
>>> congregate around Edward Kmett, who is tall and should be easy to find ;-).
>>>
>>>
>>>
>>> I think Ryan is going to show us how to use his new primops for combined
>>> array + other fields in one heap object?
>>>
>>>
>>>
>>> On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett <ekmett at gmail.com> wrote:
>>>
>>> Without a custom primitive it doesn't help much there, you have to store
>>> the indirection to the mask.
>>>
>>>
>>>
>>> With a custom primitive it should cut the on heap root-to-leaf path of
>>> everything in the HAMT in half. A shorter HashMap was actually one of the
>>> motivating factors for me doing this. It is rather astoundingly difficult
>>> to beat the performance of HashMap, so I had to start cheating pretty
>>> badly. ;)
>>>
>>>
>>>
>>> -Edward
>>>
>>>
>>>
>>> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell <johan.tibell at gmail.com>
>>> wrote:
>>>
>>> I'd also be interested to chat at ICFP to see if I can use this for my
>>> HAMT implementation.
>>>
>>>
>>>
>>> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett <ekmett at gmail.com> wrote:
>>>
>>> Sounds good to me. Right now I'm just hacking up composable accessors
>>> for "typed slots" in a fairly lens-like fashion, and treating the set of
>>> slots I define and the 'new' function I build for the data type as its API,
>>> and build atop that. This could eventually graduate to template-haskell,
>>> but I'm not entirely satisfied with the solution I have. I currently
>>> distinguish between what I'm calling "slots" (things that point directly to
>>> another SmallMutableArrayArray# sans wrapper) and "fields" which point
>>> directly to the usual Haskell data types because unifying the two notions
>>> meant that I couldn't lift some coercions out "far enough" to make them
>>> vanish.
>>>
>>>
>>>
>>> I'll be happy to run through my current working set of issues in person
>>> and -- as things get nailed down further -- in a longer lived medium than
>>> in personal conversations. ;)
>>>
>>>
>>>
>>> -Edward
>>>
>>>
>>>
>>> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton <rrnewton at gmail.com> wrote:
>>>
>>> I'd also love to meet up at ICFP and discuss this.  I think the array
>>> primops plus a TH layer that lets (ab)use them many times without too much
>>> marginal cost sounds great.  And I'd like to learn how we could be either
>>> early users of, or help with, this infrastructure.
>>>
>>>
>>>
>>> CC'ing in Ryan Scot and Omer Agacan who may also be interested in
>>> dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student
>>> who is currently working on concurrent data structures in Haskell, but will
>>> not be at ICFP.
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates <fryguybob at gmail.com> wrote:
>>>
>>> I completely agree.  I would love to spend some time during ICFP and
>>> friends talking about what it could look like.  My small array for STM
>>> changes for the RTS can be seen here [1].  It is on a branch somewhere
>>> between 7.8 and 7.10 and includes irrelevant STM bits and some
>>> confusing naming choices (sorry), but should cover all the details
>>> needed to implement it for a non-STM context.  The biggest surprise
>>> for me was following small array too closely and having a word/byte
>>> offset miss-match [2].
>>>
>>> [1]:
>>> https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut
>>> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413
>>>
>>> Ryan
>>>
>>>
>>> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett <ekmett at gmail.com> wrote:
>>> > I'd love to have that last 10%, but its a lot of work to get there and
>>> more
>>> > importantly I don't know quite what it should look like.
>>> >
>>> > On the other hand, I do have a pretty good idea of how the primitives
>>> above
>>> > could be banged out and tested in a long evening, well in time for
>>> 7.12. And
>>> > as noted earlier, those remain useful even if a nicer typed version
>>> with an
>>> > extra level of indirection to the sizes is built up after.
>>> >
>>> > The rest sounds like a good graduate student project for someone who
>>> has
>>> > graduate students lying around. Maybe somebody at Indiana University
>>> who has
>>> > an interest in type theory and parallelism can find us one. =)
>>> >
>>> > -Edward
>>> >
>>> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates <fryguybob at gmail.com>
>>> wrote:
>>> >>
>>> >> I think from my perspective, the motivation for getting the type
>>> >> checker involved is primarily bringing this to the level where users
>>> >> could be expected to build these structures.  it is reasonable to
>>> >> think that there are people who want to use STM (a context with
>>> >> mutation already) to implement a straight forward data structure that
>>> >> avoids extra indirection penalty.  There should be some places where
>>> >> knowing that things are field accesses rather then array indexing
>>> >> could be helpful, but I think GHC is good right now about handling
>>> >> constant offsets.  In my code I don't do any bounds checking as I know
>>> >> I will only be accessing my arrays with constant indexes.  I make
>>> >> wrappers for each field access and leave all the unsafe stuff in
>>> >> there.  When things go wrong though, the compiler is no help.  Maybe
>>> >> template Haskell that generates the appropriate wrappers is the right
>>> >> direction to go.
>>> >> There is another benefit for me when working with these as arrays in
>>> >> that it is quite simple and direct (given the hoops already jumped
>>> >> through) to play with alignment.  I can ensure two pointers are never
>>> >> on the same cache-line by just spacing things out in the array.
>>> >>
>>> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett <ekmett at gmail.com>
>>> wrote:
>>> >> > They just segfault at this level. ;)
>>> >> >
>>> >> > Sent from my iPhone
>>> >> >
>>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton <rrnewton at gmail.com>
>>> wrote:
>>> >> >
>>> >> > You presumably also save a bounds check on reads by hard-coding the
>>> >> > sizes?
>>> >> >
>>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett <ekmett at gmail.com>
>>> wrote:
>>> >> >>
>>> >> >> Also there are 4 different "things" here, basically depending on
>>> two
>>> >> >> independent questions:
>>> >> >>
>>> >> >> a.) if you want to shove the sizes into the info table, and
>>> >> >> b.) if you want cardmarking.
>>> >> >>
>>> >> >> Versions with/without cardmarking for different sizes can be done
>>> >> >> pretty
>>> >> >> easily, but as noted, the infotable variants are pretty invasive.
>>> >> >>
>>> >> >> -Edward
>>> >> >>
>>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett <ekmett at gmail.com>
>>> wrote:
>>> >> >>>
>>> >> >>> Well, on the plus side you'd save 16 bytes per object, which adds
>>> up
>>> >> >>> if
>>> >> >>> they were small enough and there are enough of them. You get a bit
>>> >> >>> better
>>> >> >>> locality of reference in terms of what fits in the first cache
>>> line of
>>> >> >>> them.
>>> >> >>>
>>> >> >>> -Edward
>>> >> >>>
>>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton <rrnewton at gmail.com>
>>> >> >>> wrote:
>>> >> >>>>
>>> >> >>>> Yes. And for the short term I can imagine places we will settle
>>> with
>>> >> >>>> arrays even if it means tracking lengths unnecessarily and
>>> >> >>>> unsafeCoercing
>>> >> >>>> pointers whose types don't actually match their siblings.
>>> >> >>>>
>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed
>>> sized
>>> >> >>>> array
>>> >> >>>> objects *other* than using them to fake structs? (Much to
>>> >> >>>> derecommend, as
>>> >> >>>> you mentioned!)
>>> >> >>>>
>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett <ekmett at gmail.com>
>>> >> >>>> wrote:
>>> >> >>>>>
>>> >> >>>>> I think both are useful, but the one you suggest requires a lot
>>> more
>>> >> >>>>> plumbing and doesn't subsume all of the usecases of the other.
>>> >> >>>>>
>>> >> >>>>> -Edward
>>> >> >>>>>
>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton <
>>> rrnewton at gmail.com>
>>> >> >>>>> wrote:
>>> >> >>>>>>
>>> >> >>>>>> So that primitive is an array like thing (Same pointed type,
>>> >> >>>>>> unbounded
>>> >> >>>>>> length) with extra payload.
>>> >> >>>>>>
>>> >> >>>>>> I can see how we can do without structs if we have arrays,
>>> >> >>>>>> especially
>>> >> >>>>>> with the extra payload at front. But wouldn't the general
>>> solution
>>> >> >>>>>> for
>>> >> >>>>>> structs be one that that allows new user data type defs for #
>>> >> >>>>>> types?
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett <ekmett at gmail.com
>>> >
>>> >> >>>>>> wrote:
>>> >> >>>>>>>
>>> >> >>>>>>> Some form of MutableStruct# with a known number of words and a
>>> >> >>>>>>> known
>>> >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting
>>> >> >>>>>>> above, but
>>> >> >>>>>>> where the word counts were stored in the objects themselves.
>>> >> >>>>>>>
>>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd
>>> >> >>>>>>> likely
>>> >> >>>>>>> want to be something we build in addition to MutVar# rather
>>> than a
>>> >> >>>>>>> replacement.
>>> >> >>>>>>>
>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build
>>> info
>>> >> >>>>>>> tables that knew them, and typechecker support, for instance,
>>> it'd
>>> >> >>>>>>> get
>>> >> >>>>>>> rather invasive.
>>> >> >>>>>>>
>>> >> >>>>>>> Also, a number of things that we can do with the 'sized'
>>> versions
>>> >> >>>>>>> above, like working with evil unsized c-style arrays directly
>>> >> >>>>>>> inline at the
>>> >> >>>>>>> end of the structure cease to be possible, so it isn't even a
>>> pure
>>> >> >>>>>>> win if we
>>> >> >>>>>>> did the engineering effort.
>>> >> >>>>>>>
>>> >> >>>>>>> I think 90% of the needs I have are covered just by adding
>>> the one
>>> >> >>>>>>> primitive. The last 10% gets pretty invasive.
>>> >> >>>>>>>
>>> >> >>>>>>> -Edward
>>> >> >>>>>>>
>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton <
>>> rrnewton at gmail.com>
>>> >> >>>>>>> wrote:
>>> >> >>>>>>>>
>>> >> >>>>>>>> I like the possibility of a general solution for mutable
>>> structs
>>> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's
>>> hard.
>>> >> >>>>>>>>
>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of
>>> object
>>> >> >>>>>>>> identity problems. But what about directly supporting an
>>> >> >>>>>>>> extensible set of
>>> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even
>>> replacing)
>>> >> >>>>>>>> MutVar#? That
>>> >> >>>>>>>> may be too much work, but is it problematic otherwise?
>>> >> >>>>>>>>
>>> >> >>>>>>>> Needless to say, this is also critical if we ever want best
>>> in
>>> >> >>>>>>>> class
>>> >> >>>>>>>> lockfree mutable structures, just like their Stm and
>>> sequential
>>> >> >>>>>>>> counterparts.
>>> >> >>>>>>>>
>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones
>>> >> >>>>>>>> <simonpj at microsoft.com> wrote:
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a
>>> short
>>> >> >>>>>>>>> article.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and
>>> >> >>>>>>>>> maybe
>>> >> >>>>>>>>> make a ticket for it.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Thanks
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Simon
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com]
>>> >> >>>>>>>>> Sent: 27 August 2015 16:54
>>> >> >>>>>>>>> To: Simon Peyton Jones
>>> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs
>>> >> >>>>>>>>> Subject: Re: ArrayArrays
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant.
>>> It
>>> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or
>>> ByteArray#'s.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> While those live in #, they are garbage collected objects,
>>> so
>>> >> >>>>>>>>> this
>>> >> >>>>>>>>> all lives on the heap.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when it
>>> has
>>> >> >>>>>>>>> to
>>> >> >>>>>>>>> deal with nested arrays.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better
>>> thing.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> The Problem
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> -----------------
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Consider the scenario where you write a classic
>>> doubly-linked
>>> >> >>>>>>>>> list
>>> >> >>>>>>>>> in Haskell.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3
>>> pointers
>>> >> >>>>>>>>> on
>>> >> >>>>>>>>> the heap.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~>
>>> >> >>>>>>>>> Maybe
>>> >> >>>>>>>>> DLL ~> DLL
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> That is 3 levels of indirection.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with
>>> >> >>>>>>>>> -funbox-strict-fields or UNPACK
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL
>>> and
>>> >> >>>>>>>>> worsening our representation.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> but now we're still stuck with a level of indirection
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> This means that every operation we perform on this structure
>>> >> >>>>>>>>> will
>>> >> >>>>>>>>> be about half of the speed of an implementation in most
>>> other
>>> >> >>>>>>>>> languages
>>> >> >>>>>>>>> assuming we're memory bound on loading things into cache!
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Making Progress
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> ----------------------
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> I have been working on a number of data structures where the
>>> >> >>>>>>>>> indirection of going from something in * out to an object
>>> in #
>>> >> >>>>>>>>> which
>>> >> >>>>>>>>> contains the real pointer to my target and coming back
>>> >> >>>>>>>>> effectively doubles
>>> >> >>>>>>>>> my runtime.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the
>>> >> >>>>>>>>> MutVar#
>>> >> >>>>>>>>> onto the mutable list when we dirty it. There is a well
>>> defined
>>> >> >>>>>>>>> write-barrier.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> I could change out the representation to use
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> I can just store two pointers in the MutableArray# every
>>> time,
>>> >> >>>>>>>>> but
>>> >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the
>>> amount of
>>> >> >>>>>>>>> distinct
>>> >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per
>>> >> >>>>>>>>> object to 2.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to
>>> the
>>> >> >>>>>>>>> array
>>> >> >>>>>>>>> object and then chase it to the next DLL and chase that to
>>> the
>>> >> >>>>>>>>> next array. I
>>> >> >>>>>>>>> do get my two pointers together in memory though. I'm
>>> paying for
>>> >> >>>>>>>>> a card
>>> >> >>>>>>>>> marking table as well, which I don't particularly need with
>>> just
>>> >> >>>>>>>>> two
>>> >> >>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#"
>>> >> >>>>>>>>> machinery added
>>> >> >>>>>>>>> back in 7.10, which is just the old array code a a new data
>>> >> >>>>>>>>> type, which can
>>> >> >>>>>>>>> speed things up a bit when you don't have very big arrays:
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> But what if I wanted my object itself to live in # and have
>>> two
>>> >> >>>>>>>>> mutable fields and be able to share the sme write barrier?
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array
>>> types.
>>> >> >>>>>>>>> What
>>> >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with
>>> the
>>> >> >>>>>>>>> impedence
>>> >> >>>>>>>>> mismatch between the imperative world and Haskell, and then
>>> just
>>> >> >>>>>>>>> let the
>>> >> >>>>>>>>> ArrayArray#'s hold other arrayarrays.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld)
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a
>>> >> >>>>>>>>> special
>>> >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can even
>>> >> >>>>>>>>> abuse pattern
>>> >> >>>>>>>>> synonyms. Alternately I can exploit the internals further to
>>> >> >>>>>>>>> make this
>>> >> >>>>>>>>> cheaper.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and
>>> >> >>>>>>>>> writeMutableArrayArray# calls to directly access the
>>> preceding
>>> >> >>>>>>>>> and next
>>> >> >>>>>>>>> entry in the linked list.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me'
>>> into a
>>> >> >>>>>>>>> strict world, and everything there lives in #.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> next :: DLL -> IO DLL
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>    (# s', n #) -> (# s', DLL n #)
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that
>>> code to
>>> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty
>>> >> >>>>>>>>> easily when they
>>> >> >>>>>>>>> are known strict and you chain operations of this sort!
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Cleaning it Up
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> ------------------
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Now I have one outermost indirection pointing to an array
>>> that
>>> >> >>>>>>>>> points directly to other arrays.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> I'm stuck paying for a card marking table per object, but I
>>> can
>>> >> >>>>>>>>> fix
>>> >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and
>>> using a
>>> >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me store
>>> a
>>> >> >>>>>>>>> mixture of
>>> >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data
>>> structure.
>>> >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the
>>> >> >>>>>>>>> existing
>>> >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one of
>>> the
>>> >> >>>>>>>>> arguments it
>>> >> >>>>>>>>> takes.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> This is almost ideal, but not quite. I often have fields
>>> that
>>> >> >>>>>>>>> would
>>> >> >>>>>>>>> be best left unboxed.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can
>>> currently
>>> >> >>>>>>>>> at
>>> >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at a
>>> >> >>>>>>>>> boxed or at a
>>> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the
>>> int in
>>> >> >>>>>>>>> question in
>>> >> >>>>>>>>> there.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need
>>> to
>>> >> >>>>>>>>> store masks and administrivia as I walk down the tree.
>>> Having to
>>> >> >>>>>>>>> go off to
>>> >> >>>>>>>>> the side costs me the entire win from avoiding the first
>>> pointer
>>> >> >>>>>>>>> chase.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could
>>> >> >>>>>>>>> construct that had n words with unsafe access and m
>>> pointers to
>>> >> >>>>>>>>> other heap
>>> >> >>>>>>>>> objects, one that could put itself on the mutable list when
>>> any
>>> >> >>>>>>>>> of those
>>> >> >>>>>>>>> pointers changed then I could shed this last factor of two
>>> in
>>> >> >>>>>>>>> all
>>> >> >>>>>>>>> circumstances.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Prototype
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> -------------
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Over the last few days I've put together a small prototype
>>> >> >>>>>>>>> implementation with a few non-trivial imperative data
>>> structures
>>> >> >>>>>>>>> for things
>>> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and
>>> >> >>>>>>>>> order-maintenance.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> https://github.com/ekmett/structs
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Notable bits:
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of
>>> >> >>>>>>>>> link-cut
>>> >> >>>>>>>>> trees in this style.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts
>>> that
>>> >> >>>>>>>>> make
>>> >> >>>>>>>>> it go fast.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core,
>>> almost
>>> >> >>>>>>>>> all
>>> >> >>>>>>>>> the references to the LinkCut or Object data constructor get
>>> >> >>>>>>>>> optimized away,
>>> >> >>>>>>>>> and we're left with beautiful strict code directly mutating
>>> out
>>> >> >>>>>>>>> underlying
>>> >> >>>>>>>>> representation.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a
>>> short
>>> >> >>>>>>>>> article.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> -Edward
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones
>>> >> >>>>>>>>> <simonpj at microsoft.com> wrote:
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Just to say that I have no idea what is going on in this
>>> thread.
>>> >> >>>>>>>>> What is ArrayArray?  What is the issue in general?  Is
>>> there a
>>> >> >>>>>>>>> ticket? Is
>>> >> >>>>>>>>> there a wiki page?
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> If it’s important, an ab-initio wiki page + ticket would be
>>> a
>>> >> >>>>>>>>> good
>>> >> >>>>>>>>> thing.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Simon
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On
>>> Behalf
>>> >> >>>>>>>>> Of
>>> >> >>>>>>>>> Edward Kmett
>>> >> >>>>>>>>> Sent: 21 August 2015 05:25
>>> >> >>>>>>>>> To: Manuel M T Chakravarty
>>> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs
>>> >> >>>>>>>>> Subject: Re: ArrayArrays
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's
>>> would be
>>> >> >>>>>>>>> very handy as well.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Consider right now if I have something like an
>>> order-maintenance
>>> >> >>>>>>>>> structure I have:
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s)
>>> {-#
>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s
>>> >> >>>>>>>>> (Upper s))
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s))
>>> {-#
>>> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s
>>> >> >>>>>>>>> (Lower s)) {-#
>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s))
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> The former contains, logically, a mutable integer and two
>>> >> >>>>>>>>> pointers,
>>> >> >>>>>>>>> one for forward and one for backwards. The latter is
>>> basically
>>> >> >>>>>>>>> the same
>>> >> >>>>>>>>> thing with a mutable reference up pointing at the structure
>>> >> >>>>>>>>> above.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> On the heap this is an object that points to a structure
>>> for the
>>> >> >>>>>>>>> bytearray, and points to another structure for each mutvar
>>> which
>>> >> >>>>>>>>> each point
>>> >> >>>>>>>>> to the other 'Upper' structure. So there is a level of
>>> >> >>>>>>>>> indirection smeared
>>> >> >>>>>>>>> over everything.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward link
>>> >> >>>>>>>>> from
>>> >> >>>>>>>>> the structure below to the structure above.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Converted into ArrayArray#s I'd get
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s)
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#,
>>> and
>>> >> >>>>>>>>> the
>>> >> >>>>>>>>> next 2 slots pointing to the previous and next previous
>>> objects,
>>> >> >>>>>>>>> represented
>>> >> >>>>>>>>> just as their MutableArrayArray#s. I can use
>>> >> >>>>>>>>> sameMutableArrayArray# on these
>>> >> >>>>>>>>> for object identity, which lets me check for the ends of the
>>> >> >>>>>>>>> lists by tying
>>> >> >>>>>>>>> things back on themselves.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> and below that
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s)
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing
>>> up to
>>> >> >>>>>>>>> an
>>> >> >>>>>>>>> upper structure.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> I can then write a handful of combinators for getting out
>>> the
>>> >> >>>>>>>>> slots
>>> >> >>>>>>>>> in question, while it has gained a level of indirection
>>> between
>>> >> >>>>>>>>> the wrapper
>>> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one
>>> can
>>> >> >>>>>>>>> be basically
>>> >> >>>>>>>>> erased by ghc.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Unlike before I don't have several separate objects on the
>>> heap
>>> >> >>>>>>>>> for
>>> >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for
>>> the
>>> >> >>>>>>>>> object itself,
>>> >> >>>>>>>>> and the MutableByteArray# that it references to carry
>>> around the
>>> >> >>>>>>>>> mutable
>>> >> >>>>>>>>> int.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> The only pain points are
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> 1.) the aforementioned limitation that currently prevents me
>>> >> >>>>>>>>> from
>>> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array
>>> into an
>>> >> >>>>>>>>> ArrayArray
>>> >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest of
>>> >> >>>>>>>>> Haskell,
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> and
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid
>>> the
>>> >> >>>>>>>>> card marking overhead. These objects are all small, 3-4
>>> pointers
>>> >> >>>>>>>>> wide. Card
>>> >> >>>>>>>>> marking doesn't help.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Alternately I could just try to do really evil things and
>>> >> >>>>>>>>> convert
>>> >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to
>>> >> >>>>>>>>> unsafeCoerce my way
>>> >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays
>>> >> >>>>>>>>> directly into the
>>> >> >>>>>>>>> SmallArray as slots, removing the limitation  we see here by
>>> >> >>>>>>>>> aping the
>>> >> >>>>>>>>> MutableArrayArray# s API, but that gets really really
>>> dangerous!
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the
>>> >> >>>>>>>>> altar
>>> >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move
>>> them
>>> >> >>>>>>>>> and collect
>>> >> >>>>>>>>> them which rules out simpler Ptr and Addr based solutions.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> -Edward
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty
>>> >> >>>>>>>>> <chak at cse.unsw.edu.au> wrote:
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> That’s an interesting idea.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> Manuel
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> > Edward Kmett <ekmett at gmail.com>:
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> >
>>> >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array#
>>> and
>>> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the
>>> >> >>>>>>>>> > ArrayArray# entries
>>> >> >>>>>>>>> > are all directly unlifted avoiding a level of indirection
>>> for
>>> >> >>>>>>>>> > the containing
>>> >> >>>>>>>>> > structure is amazing, but I can only currently use it if
>>> my
>>> >> >>>>>>>>> > leaf level data
>>> >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s.
>>> It'd be
>>> >> >>>>>>>>> > nice to be
>>> >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down
>>> at
>>> >> >>>>>>>>> > the leaves to
>>> >> >>>>>>>>> > hold lifted contents.
>>> >> >>>>>>>>> >
>>> >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to
>>> >> >>>>>>>>> > access
>>> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do
>>> that
>>> >> >>>>>>>>> > if i tried to
>>> >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as a
>>> >> >>>>>>>>> > ByteArray#
>>> >> >>>>>>>>> > anyways, so it isn't like there is a safety story
>>> preventing
>>> >> >>>>>>>>> > this.
>>> >> >>>>>>>>> >
>>> >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection
>>> >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I
>>> >> >>>>>>>>> > could shoehorn a
>>> >> >>>>>>>>> > number of them into ArrayArrays if this worked.
>>> >> >>>>>>>>> >
>>> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of
>>> unnecessary
>>> >> >>>>>>>>> > indirection compared to c/java and this could reduce that
>>> pain
>>> >> >>>>>>>>> > to just 1
>>> >> >>>>>>>>> > level of unnecessary indirection.
>>> >> >>>>>>>>> >
>>> >> >>>>>>>>> > -Edward
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> > _______________________________________________
>>> >> >>>>>>>>> > ghc-devs mailing list
>>> >> >>>>>>>>> > ghc-devs at haskell.org
>>> >> >>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> _______________________________________________
>>> >> >>>>>>>>> ghc-devs mailing list
>>> >> >>>>>>>>> ghc-devs at haskell.org
>>> >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>> >> >>>>>>>
>>> >> >>>>>>>
>>> >> >>>>>
>>> >> >>>
>>> >> >>
>>> >> >
>>> >> >
>>> >> > _______________________________________________
>>> >> > ghc-devs mailing list
>>> >> > ghc-devs at haskell.org
>>> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>> >> >
>>> >
>>> >
>>>
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> ghc-devs mailing list
>>> ghc-devs at haskell.org
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20150907/248aa963/attachment-0001.html>


More information about the ghc-devs mailing list