ArrayArrays

Edward Kmett ekmett at gmail.com
Sun Aug 30 04:24:58 UTC 2015


Without a custom primitive it doesn't help much there, you have to store
the indirection to the mask.

With a custom primitive it should cut the on heap root-to-leaf path of
everything in the HAMT in half. A shorter HashMap was actually one of the
motivating factors for me doing this. It is rather astoundingly difficult
to beat the performance of HashMap, so I had to start cheating pretty
badly. ;)

-Edward

On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell <johan.tibell at gmail.com>
wrote:

> I'd also be interested to chat at ICFP to see if I can use this for my
> HAMT implementation.
>
> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett <ekmett at gmail.com> wrote:
>
>> Sounds good to me. Right now I'm just hacking up composable accessors for
>> "typed slots" in a fairly lens-like fashion, and treating the set of slots
>> I define and the 'new' function I build for the data type as its API, and
>> build atop that. This could eventually graduate to template-haskell, but
>> I'm not entirely satisfied with the solution I have. I currently
>> distinguish between what I'm calling "slots" (things that point directly to
>> another SmallMutableArrayArray# sans wrapper) and "fields" which point
>> directly to the usual Haskell data types because unifying the two notions
>> meant that I couldn't lift some coercions out "far enough" to make them
>> vanish.
>>
>> I'll be happy to run through my current working set of issues in person
>> and -- as things get nailed down further -- in a longer lived medium than
>> in personal conversations. ;)
>>
>> -Edward
>>
>> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton <rrnewton at gmail.com> wrote:
>>
>>> I'd also love to meet up at ICFP and discuss this.  I think the array
>>> primops plus a TH layer that lets (ab)use them many times without too much
>>> marginal cost sounds great.  And I'd like to learn how we could be either
>>> early users of, or help with, this infrastructure.
>>>
>>> CC'ing in Ryan Scot and Omer Agacan who may also be interested in
>>> dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student
>>> who is currently working on concurrent data structures in Haskell, but will
>>> not be at ICFP.
>>>
>>>
>>> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates <fryguybob at gmail.com> wrote:
>>>
>>>> I completely agree.  I would love to spend some time during ICFP and
>>>> friends talking about what it could look like.  My small array for STM
>>>> changes for the RTS can be seen here [1].  It is on a branch somewhere
>>>> between 7.8 and 7.10 and includes irrelevant STM bits and some
>>>> confusing naming choices (sorry), but should cover all the details
>>>> needed to implement it for a non-STM context.  The biggest surprise
>>>> for me was following small array too closely and having a word/byte
>>>> offset miss-match [2].
>>>>
>>>> [1]:
>>>> https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut
>>>> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413
>>>>
>>>> Ryan
>>>>
>>>> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett <ekmett at gmail.com>
>>>> wrote:
>>>> > I'd love to have that last 10%, but its a lot of work to get there
>>>> and more
>>>> > importantly I don't know quite what it should look like.
>>>> >
>>>> > On the other hand, I do have a pretty good idea of how the primitives
>>>> above
>>>> > could be banged out and tested in a long evening, well in time for
>>>> 7.12. And
>>>> > as noted earlier, those remain useful even if a nicer typed version
>>>> with an
>>>> > extra level of indirection to the sizes is built up after.
>>>> >
>>>> > The rest sounds like a good graduate student project for someone who
>>>> has
>>>> > graduate students lying around. Maybe somebody at Indiana University
>>>> who has
>>>> > an interest in type theory and parallelism can find us one. =)
>>>> >
>>>> > -Edward
>>>> >
>>>> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates <fryguybob at gmail.com>
>>>> wrote:
>>>> >>
>>>> >> I think from my perspective, the motivation for getting the type
>>>> >> checker involved is primarily bringing this to the level where users
>>>> >> could be expected to build these structures.  it is reasonable to
>>>> >> think that there are people who want to use STM (a context with
>>>> >> mutation already) to implement a straight forward data structure that
>>>> >> avoids extra indirection penalty.  There should be some places where
>>>> >> knowing that things are field accesses rather then array indexing
>>>> >> could be helpful, but I think GHC is good right now about handling
>>>> >> constant offsets.  In my code I don't do any bounds checking as I
>>>> know
>>>> >> I will only be accessing my arrays with constant indexes.  I make
>>>> >> wrappers for each field access and leave all the unsafe stuff in
>>>> >> there.  When things go wrong though, the compiler is no help.  Maybe
>>>> >> template Haskell that generates the appropriate wrappers is the right
>>>> >> direction to go.
>>>> >> There is another benefit for me when working with these as arrays in
>>>> >> that it is quite simple and direct (given the hoops already jumped
>>>> >> through) to play with alignment.  I can ensure two pointers are never
>>>> >> on the same cache-line by just spacing things out in the array.
>>>> >>
>>>> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett <ekmett at gmail.com>
>>>> wrote:
>>>> >> > They just segfault at this level. ;)
>>>> >> >
>>>> >> > Sent from my iPhone
>>>> >> >
>>>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton <rrnewton at gmail.com>
>>>> wrote:
>>>> >> >
>>>> >> > You presumably also save a bounds check on reads by hard-coding the
>>>> >> > sizes?
>>>> >> >
>>>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett <ekmett at gmail.com>
>>>> wrote:
>>>> >> >>
>>>> >> >> Also there are 4 different "things" here, basically depending on
>>>> two
>>>> >> >> independent questions:
>>>> >> >>
>>>> >> >> a.) if you want to shove the sizes into the info table, and
>>>> >> >> b.) if you want cardmarking.
>>>> >> >>
>>>> >> >> Versions with/without cardmarking for different sizes can be done
>>>> >> >> pretty
>>>> >> >> easily, but as noted, the infotable variants are pretty invasive.
>>>> >> >>
>>>> >> >> -Edward
>>>> >> >>
>>>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett <ekmett at gmail.com>
>>>> wrote:
>>>> >> >>>
>>>> >> >>> Well, on the plus side you'd save 16 bytes per object, which
>>>> adds up
>>>> >> >>> if
>>>> >> >>> they were small enough and there are enough of them. You get a
>>>> bit
>>>> >> >>> better
>>>> >> >>> locality of reference in terms of what fits in the first cache
>>>> line of
>>>> >> >>> them.
>>>> >> >>>
>>>> >> >>> -Edward
>>>> >> >>>
>>>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton <rrnewton at gmail.com
>>>> >
>>>> >> >>> wrote:
>>>> >> >>>>
>>>> >> >>>> Yes. And for the short term I can imagine places we will settle
>>>> with
>>>> >> >>>> arrays even if it means tracking lengths unnecessarily and
>>>> >> >>>> unsafeCoercing
>>>> >> >>>> pointers whose types don't actually match their siblings.
>>>> >> >>>>
>>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed
>>>> sized
>>>> >> >>>> array
>>>> >> >>>> objects *other* than using them to fake structs? (Much to
>>>> >> >>>> derecommend, as
>>>> >> >>>> you mentioned!)
>>>> >> >>>>
>>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett <ekmett at gmail.com>
>>>> >> >>>> wrote:
>>>> >> >>>>>
>>>> >> >>>>> I think both are useful, but the one you suggest requires a
>>>> lot more
>>>> >> >>>>> plumbing and doesn't subsume all of the usecases of the other.
>>>> >> >>>>>
>>>> >> >>>>> -Edward
>>>> >> >>>>>
>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton <
>>>> rrnewton at gmail.com>
>>>> >> >>>>> wrote:
>>>> >> >>>>>>
>>>> >> >>>>>> So that primitive is an array like thing (Same pointed type,
>>>> >> >>>>>> unbounded
>>>> >> >>>>>> length) with extra payload.
>>>> >> >>>>>>
>>>> >> >>>>>> I can see how we can do without structs if we have arrays,
>>>> >> >>>>>> especially
>>>> >> >>>>>> with the extra payload at front. But wouldn't the general
>>>> solution
>>>> >> >>>>>> for
>>>> >> >>>>>> structs be one that that allows new user data type defs for #
>>>> >> >>>>>> types?
>>>> >> >>>>>>
>>>> >> >>>>>>
>>>> >> >>>>>>
>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett <
>>>> ekmett at gmail.com>
>>>> >> >>>>>> wrote:
>>>> >> >>>>>>>
>>>> >> >>>>>>> Some form of MutableStruct# with a known number of words and
>>>> a
>>>> >> >>>>>>> known
>>>> >> >>>>>>> number of pointers is basically what Ryan Yates was
>>>> suggesting
>>>> >> >>>>>>> above, but
>>>> >> >>>>>>> where the word counts were stored in the objects themselves.
>>>> >> >>>>>>>
>>>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd
>>>> >> >>>>>>> likely
>>>> >> >>>>>>> want to be something we build in addition to MutVar# rather
>>>> than a
>>>> >> >>>>>>> replacement.
>>>> >> >>>>>>>
>>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build
>>>> info
>>>> >> >>>>>>> tables that knew them, and typechecker support, for
>>>> instance, it'd
>>>> >> >>>>>>> get
>>>> >> >>>>>>> rather invasive.
>>>> >> >>>>>>>
>>>> >> >>>>>>> Also, a number of things that we can do with the 'sized'
>>>> versions
>>>> >> >>>>>>> above, like working with evil unsized c-style arrays directly
>>>> >> >>>>>>> inline at the
>>>> >> >>>>>>> end of the structure cease to be possible, so it isn't even
>>>> a pure
>>>> >> >>>>>>> win if we
>>>> >> >>>>>>> did the engineering effort.
>>>> >> >>>>>>>
>>>> >> >>>>>>> I think 90% of the needs I have are covered just by adding
>>>> the one
>>>> >> >>>>>>> primitive. The last 10% gets pretty invasive.
>>>> >> >>>>>>>
>>>> >> >>>>>>> -Edward
>>>> >> >>>>>>>
>>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton <
>>>> rrnewton at gmail.com>
>>>> >> >>>>>>> wrote:
>>>> >> >>>>>>>>
>>>> >> >>>>>>>> I like the possibility of a general solution for mutable
>>>> structs
>>>> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's
>>>> hard.
>>>> >> >>>>>>>>
>>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of
>>>> object
>>>> >> >>>>>>>> identity problems. But what about directly supporting an
>>>> >> >>>>>>>> extensible set of
>>>> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even
>>>> replacing)
>>>> >> >>>>>>>> MutVar#? That
>>>> >> >>>>>>>> may be too much work, but is it problematic otherwise?
>>>> >> >>>>>>>>
>>>> >> >>>>>>>> Needless to say, this is also critical if we ever want best
>>>> in
>>>> >> >>>>>>>> class
>>>> >> >>>>>>>> lockfree mutable structures, just like their Stm and
>>>> sequential
>>>> >> >>>>>>>> counterparts.
>>>> >> >>>>>>>>
>>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones
>>>> >> >>>>>>>> <simonpj at microsoft.com> wrote:
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a
>>>> short
>>>> >> >>>>>>>>> article.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac,
>>>> and
>>>> >> >>>>>>>>> maybe
>>>> >> >>>>>>>>> make a ticket for it.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Thanks
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Simon
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> From: Edward Kmett [mailto:ekmett at gmail.com]
>>>> >> >>>>>>>>> Sent: 27 August 2015 16:54
>>>> >> >>>>>>>>> To: Simon Peyton Jones
>>>> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs
>>>> >> >>>>>>>>> Subject: Re: ArrayArrays
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified
>>>> invariant. It
>>>> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or
>>>> ByteArray#'s.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> While those live in #, they are garbage collected objects,
>>>> so
>>>> >> >>>>>>>>> this
>>>> >> >>>>>>>>> all lives on the heap.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when it
>>>> has
>>>> >> >>>>>>>>> to
>>>> >> >>>>>>>>> deal with nested arrays.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a better
>>>> thing.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> The Problem
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> -----------------
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Consider the scenario where you write a classic
>>>> doubly-linked
>>>> >> >>>>>>>>> list
>>>> >> >>>>>>>>> in Haskell.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3
>>>> pointers
>>>> >> >>>>>>>>> on
>>>> >> >>>>>>>>> the heap.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL)
>>>> ~>
>>>> >> >>>>>>>>> Maybe
>>>> >> >>>>>>>>> DLL ~> DLL
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> That is 3 levels of indirection.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with
>>>> >> >>>>>>>>> -funbox-strict-fields or UNPACK
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL
>>>> and
>>>> >> >>>>>>>>> worsening our representation.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> but now we're still stuck with a level of indirection
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> This means that every operation we perform on this
>>>> structure
>>>> >> >>>>>>>>> will
>>>> >> >>>>>>>>> be about half of the speed of an implementation in most
>>>> other
>>>> >> >>>>>>>>> languages
>>>> >> >>>>>>>>> assuming we're memory bound on loading things into cache!
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Making Progress
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> ----------------------
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> I have been working on a number of data structures where
>>>> the
>>>> >> >>>>>>>>> indirection of going from something in * out to an object
>>>> in #
>>>> >> >>>>>>>>> which
>>>> >> >>>>>>>>> contains the real pointer to my target and coming back
>>>> >> >>>>>>>>> effectively doubles
>>>> >> >>>>>>>>> my runtime.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put the
>>>> >> >>>>>>>>> MutVar#
>>>> >> >>>>>>>>> onto the mutable list when we dirty it. There is a well
>>>> defined
>>>> >> >>>>>>>>> write-barrier.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> I could change out the representation to use
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> I can just store two pointers in the MutableArray# every
>>>> time,
>>>> >> >>>>>>>>> but
>>>> >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the
>>>> amount of
>>>> >> >>>>>>>>> distinct
>>>> >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per
>>>> >> >>>>>>>>> object to 2.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get to
>>>> the
>>>> >> >>>>>>>>> array
>>>> >> >>>>>>>>> object and then chase it to the next DLL and chase that to
>>>> the
>>>> >> >>>>>>>>> next array. I
>>>> >> >>>>>>>>> do get my two pointers together in memory though. I'm
>>>> paying for
>>>> >> >>>>>>>>> a card
>>>> >> >>>>>>>>> marking table as well, which I don't particularly need
>>>> with just
>>>> >> >>>>>>>>> two
>>>> >> >>>>>>>>> pointers, but we can shed that with the
>>>> "SmallMutableArray#"
>>>> >> >>>>>>>>> machinery added
>>>> >> >>>>>>>>> back in 7.10, which is just the old array code a a new data
>>>> >> >>>>>>>>> type, which can
>>>> >> >>>>>>>>> speed things up a bit when you don't have very big arrays:
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> But what if I wanted my object itself to live in # and
>>>> have two
>>>> >> >>>>>>>>> mutable fields and be able to share the sme write barrier?
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array
>>>> types.
>>>> >> >>>>>>>>> What
>>>> >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal with
>>>> the
>>>> >> >>>>>>>>> impedence
>>>> >> >>>>>>>>> mismatch between the imperative world and Haskell, and
>>>> then just
>>>> >> >>>>>>>>> let the
>>>> >> >>>>>>>>> ArrayArray#'s hold other arrayarrays.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld)
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make be a
>>>> >> >>>>>>>>> special
>>>> >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can
>>>> even
>>>> >> >>>>>>>>> abuse pattern
>>>> >> >>>>>>>>> synonyms. Alternately I can exploit the internals further
>>>> to
>>>> >> >>>>>>>>> make this
>>>> >> >>>>>>>>> cheaper.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and
>>>> >> >>>>>>>>> writeMutableArrayArray# calls to directly access the
>>>> preceding
>>>> >> >>>>>>>>> and next
>>>> >> >>>>>>>>> entry in the linked list.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me'
>>>> into a
>>>> >> >>>>>>>>> strict world, and everything there lives in #.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> next :: DLL -> IO DLL
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>    (# s', n #) -> (# s', DLL n #)
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that
>>>> code to
>>>> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty
>>>> >> >>>>>>>>> easily when they
>>>> >> >>>>>>>>> are known strict and you chain operations of this sort!
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Cleaning it Up
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> ------------------
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Now I have one outermost indirection pointing to an array
>>>> that
>>>> >> >>>>>>>>> points directly to other arrays.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> I'm stuck paying for a card marking table per object, but
>>>> I can
>>>> >> >>>>>>>>> fix
>>>> >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and
>>>> using a
>>>> >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me
>>>> store a
>>>> >> >>>>>>>>> mixture of
>>>> >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data
>>>> structure.
>>>> >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing the
>>>> >> >>>>>>>>> existing
>>>> >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one of
>>>> the
>>>> >> >>>>>>>>> arguments it
>>>> >> >>>>>>>>> takes.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> This is almost ideal, but not quite. I often have fields
>>>> that
>>>> >> >>>>>>>>> would
>>>> >> >>>>>>>>> be best left unboxed.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can
>>>> currently
>>>> >> >>>>>>>>> at
>>>> >> >>>>>>>>> best point one of the entries of the SmallMutableArray# at
>>>> a
>>>> >> >>>>>>>>> boxed or at a
>>>> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove the
>>>> int in
>>>> >> >>>>>>>>> question in
>>>> >> >>>>>>>>> there.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I
>>>> need to
>>>> >> >>>>>>>>> store masks and administrivia as I walk down the tree.
>>>> Having to
>>>> >> >>>>>>>>> go off to
>>>> >> >>>>>>>>> the side costs me the entire win from avoiding the first
>>>> pointer
>>>> >> >>>>>>>>> chase.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we could
>>>> >> >>>>>>>>> construct that had n words with unsafe access and m
>>>> pointers to
>>>> >> >>>>>>>>> other heap
>>>> >> >>>>>>>>> objects, one that could put itself on the mutable list
>>>> when any
>>>> >> >>>>>>>>> of those
>>>> >> >>>>>>>>> pointers changed then I could shed this last factor of two
>>>> in
>>>> >> >>>>>>>>> all
>>>> >> >>>>>>>>> circumstances.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Prototype
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> -------------
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Over the last few days I've put together a small prototype
>>>> >> >>>>>>>>> implementation with a few non-trivial imperative data
>>>> structures
>>>> >> >>>>>>>>> for things
>>>> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and
>>>> >> >>>>>>>>> order-maintenance.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> https://github.com/ekmett/structs
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Notable bits:
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of
>>>> >> >>>>>>>>> link-cut
>>>> >> >>>>>>>>> trees in this style.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying guts
>>>> that
>>>> >> >>>>>>>>> make
>>>> >> >>>>>>>>> it go fast.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core,
>>>> almost
>>>> >> >>>>>>>>> all
>>>> >> >>>>>>>>> the references to the LinkCut or Object data constructor
>>>> get
>>>> >> >>>>>>>>> optimized away,
>>>> >> >>>>>>>>> and we're left with beautiful strict code directly
>>>> mutating out
>>>> >> >>>>>>>>> underlying
>>>> >> >>>>>>>>> representation.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a
>>>> short
>>>> >> >>>>>>>>> article.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> -Edward
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones
>>>> >> >>>>>>>>> <simonpj at microsoft.com> wrote:
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Just to say that I have no idea what is going on in this
>>>> thread.
>>>> >> >>>>>>>>> What is ArrayArray?  What is the issue in general?  Is
>>>> there a
>>>> >> >>>>>>>>> ticket? Is
>>>> >> >>>>>>>>> there a wiki page?
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> If it’s important, an ab-initio wiki page + ticket would
>>>> be a
>>>> >> >>>>>>>>> good
>>>> >> >>>>>>>>> thing.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Simon
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On
>>>> Behalf
>>>> >> >>>>>>>>> Of
>>>> >> >>>>>>>>> Edward Kmett
>>>> >> >>>>>>>>> Sent: 21 August 2015 05:25
>>>> >> >>>>>>>>> To: Manuel M T Chakravarty
>>>> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs
>>>> >> >>>>>>>>> Subject: Re: ArrayArrays
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's
>>>> would be
>>>> >> >>>>>>>>> very handy as well.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Consider right now if I have something like an
>>>> order-maintenance
>>>> >> >>>>>>>>> structure I have:
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s)
>>>> {-#
>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s
>>>> >> >>>>>>>>> (Upper s))
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s))
>>>> {-#
>>>> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s
>>>> >> >>>>>>>>> (Lower s)) {-#
>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s))
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> The former contains, logically, a mutable integer and two
>>>> >> >>>>>>>>> pointers,
>>>> >> >>>>>>>>> one for forward and one for backwards. The latter is
>>>> basically
>>>> >> >>>>>>>>> the same
>>>> >> >>>>>>>>> thing with a mutable reference up pointing at the structure
>>>> >> >>>>>>>>> above.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> On the heap this is an object that points to a structure
>>>> for the
>>>> >> >>>>>>>>> bytearray, and points to another structure for each mutvar
>>>> which
>>>> >> >>>>>>>>> each point
>>>> >> >>>>>>>>> to the other 'Upper' structure. So there is a level of
>>>> >> >>>>>>>>> indirection smeared
>>>> >> >>>>>>>>> over everything.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward
>>>> link
>>>> >> >>>>>>>>> from
>>>> >> >>>>>>>>> the structure below to the structure above.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Converted into ArrayArray#s I'd get
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s)
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#,
>>>> and
>>>> >> >>>>>>>>> the
>>>> >> >>>>>>>>> next 2 slots pointing to the previous and next previous
>>>> objects,
>>>> >> >>>>>>>>> represented
>>>> >> >>>>>>>>> just as their MutableArrayArray#s. I can use
>>>> >> >>>>>>>>> sameMutableArrayArray# on these
>>>> >> >>>>>>>>> for object identity, which lets me check for the ends of
>>>> the
>>>> >> >>>>>>>>> lists by tying
>>>> >> >>>>>>>>> things back on themselves.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> and below that
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s)
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot pointing
>>>> up to
>>>> >> >>>>>>>>> an
>>>> >> >>>>>>>>> upper structure.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> I can then write a handful of combinators for getting out
>>>> the
>>>> >> >>>>>>>>> slots
>>>> >> >>>>>>>>> in question, while it has gained a level of indirection
>>>> between
>>>> >> >>>>>>>>> the wrapper
>>>> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one
>>>> can
>>>> >> >>>>>>>>> be basically
>>>> >> >>>>>>>>> erased by ghc.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Unlike before I don't have several separate objects on the
>>>> heap
>>>> >> >>>>>>>>> for
>>>> >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for
>>>> the
>>>> >> >>>>>>>>> object itself,
>>>> >> >>>>>>>>> and the MutableByteArray# that it references to carry
>>>> around the
>>>> >> >>>>>>>>> mutable
>>>> >> >>>>>>>>> int.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> The only pain points are
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> 1.) the aforementioned limitation that currently prevents
>>>> me
>>>> >> >>>>>>>>> from
>>>> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or Array
>>>> into an
>>>> >> >>>>>>>>> ArrayArray
>>>> >> >>>>>>>>> leaving me in a little ghetto disconnected from the rest of
>>>> >> >>>>>>>>> Haskell,
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> and
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us
>>>> avoid the
>>>> >> >>>>>>>>> card marking overhead. These objects are all small, 3-4
>>>> pointers
>>>> >> >>>>>>>>> wide. Card
>>>> >> >>>>>>>>> marking doesn't help.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Alternately I could just try to do really evil things and
>>>> >> >>>>>>>>> convert
>>>> >> >>>>>>>>> the whole mess to SmallArrays and then figure out how to
>>>> >> >>>>>>>>> unsafeCoerce my way
>>>> >> >>>>>>>>> to glory, stuffing the #'d references to the other arrays
>>>> >> >>>>>>>>> directly into the
>>>> >> >>>>>>>>> SmallArray as slots, removing the limitation  we see here
>>>> by
>>>> >> >>>>>>>>> aping the
>>>> >> >>>>>>>>> MutableArrayArray# s API, but that gets really really
>>>> dangerous!
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on the
>>>> >> >>>>>>>>> altar
>>>> >> >>>>>>>>> of speed here, but I'd like to be able to let the GC move
>>>> them
>>>> >> >>>>>>>>> and collect
>>>> >> >>>>>>>>> them which rules out simpler Ptr and Addr based solutions.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> -Edward
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty
>>>> >> >>>>>>>>> <chak at cse.unsw.edu.au> wrote:
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> That’s an interesting idea.
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> Manuel
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> > Edward Kmett <ekmett at gmail.com>:
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> >
>>>> >> >>>>>>>>> > Would it be possible to add unsafe primops to add Array#
>>>> and
>>>> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the
>>>> >> >>>>>>>>> > ArrayArray# entries
>>>> >> >>>>>>>>> > are all directly unlifted avoiding a level of
>>>> indirection for
>>>> >> >>>>>>>>> > the containing
>>>> >> >>>>>>>>> > structure is amazing, but I can only currently use it if
>>>> my
>>>> >> >>>>>>>>> > leaf level data
>>>> >> >>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s.
>>>> It'd be
>>>> >> >>>>>>>>> > nice to be
>>>> >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff down
>>>> at
>>>> >> >>>>>>>>> > the leaves to
>>>> >> >>>>>>>>> > hold lifted contents.
>>>> >> >>>>>>>>> >
>>>> >> >>>>>>>>> > I accept fully that if I name the wrong type when I go to
>>>> >> >>>>>>>>> > access
>>>> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do
>>>> that
>>>> >> >>>>>>>>> > if i tried to
>>>> >> >>>>>>>>> > use one of the members that held a nested ArrayArray# as
>>>> a
>>>> >> >>>>>>>>> > ByteArray#
>>>> >> >>>>>>>>> > anyways, so it isn't like there is a safety story
>>>> preventing
>>>> >> >>>>>>>>> > this.
>>>> >> >>>>>>>>> >
>>>> >> >>>>>>>>> > I've been hunting for ways to try to kill the indirection
>>>> >> >>>>>>>>> > problems I get with Haskell and mutable structures, and I
>>>> >> >>>>>>>>> > could shoehorn a
>>>> >> >>>>>>>>> > number of them into ArrayArrays if this worked.
>>>> >> >>>>>>>>> >
>>>> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of
>>>> unnecessary
>>>> >> >>>>>>>>> > indirection compared to c/java and this could reduce
>>>> that pain
>>>> >> >>>>>>>>> > to just 1
>>>> >> >>>>>>>>> > level of unnecessary indirection.
>>>> >> >>>>>>>>> >
>>>> >> >>>>>>>>> > -Edward
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> > _______________________________________________
>>>> >> >>>>>>>>> > ghc-devs mailing list
>>>> >> >>>>>>>>> > ghc-devs at haskell.org
>>>> >> >>>>>>>>> >
>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>>
>>>> >> >>>>>>>>> _______________________________________________
>>>> >> >>>>>>>>> ghc-devs mailing list
>>>> >> >>>>>>>>> ghc-devs at haskell.org
>>>> >> >>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>>> >> >>>>>>>
>>>> >> >>>>>>>
>>>> >> >>>>>
>>>> >> >>>
>>>> >> >>
>>>> >> >
>>>> >> >
>>>> >> > _______________________________________________
>>>> >> > ghc-devs mailing list
>>>> >> > ghc-devs at haskell.org
>>>> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>>> >> >
>>>> >
>>>> >
>>>>
>>>
>>>
>>
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20150829/9e63bb56/attachment-0001.html>


More information about the ghc-devs mailing list