<div dir="ltr">I'd love to have that last 10%, but its a lot of work to get there and more importantly I don't know quite what it should look like. <div><br></div><div>On the other hand, I do have a pretty good idea of how the primitives above could be banged out and tested in a long evening, well in time for 7.12. And as noted earlier, those remain useful even if a nicer typed version with an extra level of indirection to the sizes is built up after.<div><br></div><div>The rest sounds like a good graduate student project for someone who has graduate students lying around. Maybe somebody at Indiana University who has an interest in type theory and parallelism can find us one. =)<div><br></div><div>-Edward</div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates <span dir="ltr"><<a href="mailto:fryguybob@gmail.com" target="_blank">fryguybob@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I think from my perspective, the motivation for getting the type<br>

checker involved is primarily bringing this to the level where users<br>

could be expected to build these structures.  it is reasonable to<br>

think that there are people who want to use STM (a context with<br>

mutation already) to implement a straight forward data structure that<br>

avoids extra indirection penalty.  There should be some places where<br>

knowing that things are field accesses rather then array indexing<br>

could be helpful, but I think GHC is good right now about handling<br>

constant offsets.  In my code I don't do any bounds checking as I know<br>

I will only be accessing my arrays with constant indexes.  I make<br>

wrappers for each field access and leave all the unsafe stuff in<br>

there.  When things go wrong though, the compiler is no help.  Maybe<br>

template Haskell that generates the appropriate wrappers is the right<br>

direction to go.<br>

There is another benefit for me when working with these as arrays in<br>

that it is quite simple and direct (given the hoops already jumped<br>

through) to play with alignment.  I can ensure two pointers are never<br>

on the same cache-line by just spacing things out in the array.<br>

<div class="HOEnZb"><div class="h5"><br>

On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>> wrote:<br>

> They just segfault at this level. ;)<br>

><br>

> Sent from my iPhone<br>

><br>

> On Aug 28, 2015, at 7:25 PM, Ryan Newton <<a href="mailto:rrnewton@gmail.com">rrnewton@gmail.com</a>> wrote:<br>

><br>

> You presumably also save a bounds check on reads by hard-coding the sizes?<br>

><br>

> On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>> wrote:<br>

>><br>

>> Also there are 4 different "things" here, basically depending on two<br>

>> independent questions:<br>

>><br>

>> a.) if you want to shove the sizes into the info table, and<br>

>> b.) if you want cardmarking.<br>

>><br>

>> Versions with/without cardmarking for different sizes can be done pretty<br>

>> easily, but as noted, the infotable variants are pretty invasive.<br>

>><br>

>> -Edward<br>

>><br>

>> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>> wrote:<br>

>>><br>

>>> Well, on the plus side you'd save 16 bytes per object, which adds up if<br>

>>> they were small enough and there are enough of them. You get a bit better<br>

>>> locality of reference in terms of what fits in the first cache line of them.<br>

>>><br>

>>> -Edward<br>

>>><br>

>>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton <<a href="mailto:rrnewton@gmail.com">rrnewton@gmail.com</a>> wrote:<br>

>>>><br>

>>>> Yes. And for the short term I can imagine places we will settle with<br>

>>>> arrays even if it means tracking lengths unnecessarily and unsafeCoercing<br>

>>>> pointers whose types don't actually match their siblings.<br>

>>>><br>

>>>> Is there anything to recommend the hacks mentioned for fixed sized array<br>

>>>> objects *other* than using them to fake structs? (Much to derecommend, as<br>

>>>> you mentioned!)<br>

>>>><br>

>>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>> wrote:<br>

>>>>><br>

>>>>> I think both are useful, but the one you suggest requires a lot more<br>

>>>>> plumbing and doesn't subsume all of the usecases of the other.<br>

>>>>><br>

>>>>> -Edward<br>

>>>>><br>

>>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton <<a href="mailto:rrnewton@gmail.com">rrnewton@gmail.com</a>><br>

>>>>> wrote:<br>

>>>>>><br>

>>>>>> So that primitive is an array like thing (Same pointed type, unbounded<br>

>>>>>> length) with extra payload.<br>

>>>>>><br>

>>>>>> I can see how we can do without structs if we have arrays, especially<br>

>>>>>> with the extra payload at front. But wouldn't the general solution for<br>

>>>>>> structs be one that that allows new user data type defs for # types?<br>

>>>>>><br>

>>>>>><br>

>>>>>><br>

>>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>> wrote:<br>

>>>>>>><br>

>>>>>>> Some form of MutableStruct# with a known number of words and a known<br>

>>>>>>> number of pointers is basically what Ryan Yates was suggesting above, but<br>

>>>>>>> where the word counts were stored in the objects themselves.<br>

>>>>>>><br>

>>>>>>> Given that it'd have a couple of words for those counts it'd likely<br>

>>>>>>> want to be something we build in addition to MutVar# rather than a<br>

>>>>>>> replacement.<br>

>>>>>>><br>

>>>>>>> On the other hand, if we had to fix those numbers and build info<br>

>>>>>>> tables that knew them, and typechecker support, for instance, it'd get<br>

>>>>>>> rather invasive.<br>

>>>>>>><br>

>>>>>>> Also, a number of things that we can do with the 'sized' versions<br>

>>>>>>> above, like working with evil unsized c-style arrays directly inline at the<br>

>>>>>>> end of the structure cease to be possible, so it isn't even a pure win if we<br>

>>>>>>> did the engineering effort.<br>

>>>>>>><br>

>>>>>>> I think 90% of the needs I have are covered just by adding the one<br>

>>>>>>> primitive. The last 10% gets pretty invasive.<br>

>>>>>>><br>

>>>>>>> -Edward<br>

>>>>>>><br>

>>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton <<a href="mailto:rrnewton@gmail.com">rrnewton@gmail.com</a>><br>

>>>>>>> wrote:<br>

>>>>>>>><br>

>>>>>>>> I like the possibility of a general solution for mutable structs<br>

>>>>>>>> (like Ed said), and I'm trying to fully understand why it's hard.<br>

>>>>>>>><br>

>>>>>>>> So, we can't unpack MutVar into constructors because of object<br>

>>>>>>>> identity problems. But what about directly supporting an extensible set of<br>

>>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) MutVar#? That<br>

>>>>>>>> may be too much work, but is it problematic otherwise?<br>

>>>>>>>><br>

>>>>>>>> Needless to say, this is also critical if we ever want best in class<br>

>>>>>>>> lockfree mutable structures, just like their Stm and sequential<br>

>>>>>>>> counterparts.<br>

>>>>>>>><br>

>>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones<br>

>>>>>>>> <<a href="mailto:simonpj@microsoft.com">simonpj@microsoft.com</a>> wrote:<br>

>>>>>>>>><br>

>>>>>>>>> At the very least I'll take this email and turn it into a short<br>

>>>>>>>>> article.<br>

>>>>>>>>><br>

>>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe<br>

>>>>>>>>> make a ticket for it.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Thanks<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Simon<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> From: Edward Kmett [mailto:<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>]<br>

>>>>>>>>> Sent: 27 August 2015 16:54<br>

>>>>>>>>> To: Simon Peyton Jones<br>

>>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs<br>

>>>>>>>>> Subject: Re: ArrayArrays<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It<br>

>>>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> While those live in #, they are garbage collected objects, so this<br>

>>>>>>>>> all lives on the heap.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> They were added to make some of the DPH stuff fast when it has to<br>

>>>>>>>>> deal with nested arrays.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> I'm currently abusing them as a placeholder for a better thing.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> The Problem<br>

>>>>>>>>><br>

>>>>>>>>> -----------------<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Consider the scenario where you write a classic doubly-linked list<br>

>>>>>>>>> in Haskell.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Chasing from one DLL to the next requires following 3 pointers on<br>

>>>>>>>>> the heap.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe<br>

>>>>>>>>> DLL ~> DLL<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> That is 3 levels of indirection.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> We can trim one by simply unpacking the IORef with<br>

>>>>>>>>> -funbox-strict-fields or UNPACK<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and<br>

>>>>>>>>> worsening our representation.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> but now we're still stuck with a level of indirection<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> This means that every operation we perform on this structure will<br>

>>>>>>>>> be about half of the speed of an implementation in most other languages<br>

>>>>>>>>> assuming we're memory bound on loading things into cache!<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Making Progress<br>

>>>>>>>>><br>

>>>>>>>>> ----------------------<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> I have been working on a number of data structures where the<br>

>>>>>>>>> indirection of going from something in * out to an object in # which<br>

>>>>>>>>> contains the real pointer to my target and coming back effectively doubles<br>

>>>>>>>>> my runtime.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> We go out to the MutVar# because we are allowed to put the MutVar#<br>

>>>>>>>>> onto the mutable list when we dirty it. There is a well defined<br>

>>>>>>>>> write-barrier.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> I could change out the representation to use<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> I can just store two pointers in the MutableArray# every time, but<br>

>>>>>>>>> this doesn't help _much_ directly. It has reduced the amount of distinct<br>

>>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per object to 2.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> I still have to go out to the heap from my DLL and get to the array<br>

>>>>>>>>> object and then chase it to the next DLL and chase that to the next array. I<br>

>>>>>>>>> do get my two pointers together in memory though. I'm paying for a card<br>

>>>>>>>>> marking table as well, which I don't particularly need with just two<br>

>>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" machinery added<br>

>>>>>>>>> back in 7.10, which is just the old array code a a new data type, which can<br>

>>>>>>>>> speed things up a bit when you don't have very big arrays:<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> But what if I wanted my object itself to live in # and have two<br>

>>>>>>>>> mutable fields and be able to share the sme write barrier?<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> An ArrayArray# points directly to other unlifted array types. What<br>

>>>>>>>>> if we have one # -> * wrapper on the outside to deal with the impedence<br>

>>>>>>>>> mismatch between the imperative world and Haskell, and then just let the<br>

>>>>>>>>> ArrayArray#'s hold other arrayarrays.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld)<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> now I need to make up a new Nil, which I can just make be a special<br>

>>>>>>>>> MutableArrayArray# I allocate on program startup. I can even abuse pattern<br>

>>>>>>>>> synonyms. Alternately I can exploit the internals further to make this<br>

>>>>>>>>> cheaper.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Then I can use the readMutableArrayArray# and<br>

>>>>>>>>> writeMutableArrayArray# calls to directly access the preceding and next<br>

>>>>>>>>> entry in the linked list.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a<br>

>>>>>>>>> strict world, and everything there lives in #.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> next :: DLL -> IO DLL<br>

>>>>>>>>><br>

>>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of<br>

>>>>>>>>><br>

>>>>>>>>>    (# s', n #) -> (# s', DLL n #)<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> It turns out GHC is quite happy to optimize all of that code to<br>

>>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty easily when they<br>

>>>>>>>>> are known strict and you chain operations of this sort!<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Cleaning it Up<br>

>>>>>>>>><br>

>>>>>>>>> ------------------<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Now I have one outermost indirection pointing to an array that<br>

>>>>>>>>> points directly to other arrays.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> I'm stuck paying for a card marking table per object, but I can fix<br>

>>>>>>>>> that by duplicating the code for MutableArrayArray# and using a<br>

>>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a mixture of<br>

>>>>>>>>> SmallMutableArray# fields and normal ones in the data structure.<br>

>>>>>>>>> Operationally, I can even do so by just unsafeCoercing the existing<br>

>>>>>>>>> SmallMutableArray# primitives to change the kind of one of the arguments it<br>

>>>>>>>>> takes.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> This is almost ideal, but not quite. I often have fields that would<br>

>>>>>>>>> be best left unboxed.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> was able to unpack the Int, but we lost that. We can currently at<br>

>>>>>>>>> best point one of the entries of the SmallMutableArray# at a boxed or at a<br>

>>>>>>>>> MutableByteArray# for all of our misc. data and shove the int in question in<br>

>>>>>>>>> there.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to<br>

>>>>>>>>> store masks and administrivia as I walk down the tree. Having to go off to<br>

>>>>>>>>> the side costs me the entire win from avoiding the first pointer chase.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> But, if like Ryan suggested, we had a heap object we could<br>

>>>>>>>>> construct that had n words with unsafe access and m pointers to other heap<br>

>>>>>>>>> objects, one that could put itself on the mutable list when any of those<br>

>>>>>>>>> pointers changed then I could shed this last factor of two in all<br>

>>>>>>>>> circumstances.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Prototype<br>

>>>>>>>>><br>

>>>>>>>>> -------------<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Over the last few days I've put together a small prototype<br>

>>>>>>>>> implementation with a few non-trivial imperative data structures for things<br>

>>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and<br>

>>>>>>>>> order-maintenance.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> <a href="https://github.com/ekmett/structs" rel="noreferrer" target="_blank">https://github.com/ekmett/structs</a><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Notable bits:<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of link-cut<br>

>>>>>>>>> trees in this style.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that make<br>

>>>>>>>>> it go fast.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost all<br>

>>>>>>>>> the references to the LinkCut or Object data constructor get optimized away,<br>

>>>>>>>>> and we're left with beautiful strict code directly mutating out underlying<br>

>>>>>>>>> representation.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> At the very least I'll take this email and turn it into a short<br>

>>>>>>>>> article.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> -Edward<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones<br>

>>>>>>>>> <<a href="mailto:simonpj@microsoft.com">simonpj@microsoft.com</a>> wrote:<br>

>>>>>>>>><br>

>>>>>>>>> Just to say that I have no idea what is going on in this thread.<br>

>>>>>>>>> What is ArrayArray?  What is the issue in general?  Is there a ticket? Is<br>

>>>>>>>>> there a wiki page?<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> If it’s important, an ab-initio wiki page + ticket would be a good<br>

>>>>>>>>> thing.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Simon<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> From: ghc-devs [mailto:<a href="mailto:ghc-devs-bounces@haskell.org">ghc-devs-bounces@haskell.org</a>] On Behalf Of<br>

>>>>>>>>> Edward Kmett<br>

>>>>>>>>> Sent: 21 August 2015 05:25<br>

>>>>>>>>> To: Manuel M T Chakravarty<br>

>>>>>>>>> Cc: Simon Marlow; ghc-devs<br>

>>>>>>>>> Subject: Re: ArrayArrays<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be<br>

>>>>>>>>> very handy as well.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Consider right now if I have something like an order-maintenance<br>

>>>>>>>>> structure I have:<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-#<br>

>>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s))<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-#<br>

>>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-#<br>

>>>>>>>>> UNPACK #-} !(MutVar s (Lower s))<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> The former contains, logically, a mutable integer and two pointers,<br>

>>>>>>>>> one for forward and one for backwards. The latter is basically the same<br>

>>>>>>>>> thing with a mutable reference up pointing at the structure above.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> On the heap this is an object that points to a structure for the<br>

>>>>>>>>> bytearray, and points to another structure for each mutvar which each point<br>

>>>>>>>>> to the other 'Upper' structure. So there is a level of indirection smeared<br>

>>>>>>>>> over everything.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> So this is a pair of doubly linked lists with an upward link from<br>

>>>>>>>>> the structure below to the structure above.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Converted into ArrayArray#s I'd get<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> data Upper s = Upper (MutableArrayArray# s)<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and the<br>

>>>>>>>>> next 2 slots pointing to the previous and next previous objects, represented<br>

>>>>>>>>> just as their MutableArrayArray#s. I can use sameMutableArrayArray# on these<br>

>>>>>>>>> for object identity, which lets me check for the ends of the lists by tying<br>

>>>>>>>>> things back on themselves.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> and below that<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> data Lower s = Lower (MutableArrayArray# s)<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> is similar, with an extra MutableArrayArray slot pointing up to an<br>

>>>>>>>>> upper structure.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> I can then write a handful of combinators for getting out the slots<br>

>>>>>>>>> in question, while it has gained a level of indirection between the wrapper<br>

>>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one can be basically<br>

>>>>>>>>> erased by ghc.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Unlike before I don't have several separate objects on the heap for<br>

>>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for the object itself,<br>

>>>>>>>>> and the MutableByteArray# that it references to carry around the mutable<br>

>>>>>>>>> int.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> The only pain points are<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> 1.) the aforementioned limitation that currently prevents me from<br>

>>>>>>>>> stuffing normal boxed data through a SmallArray or Array into an ArrayArray<br>

>>>>>>>>> leaving me in a little ghetto disconnected from the rest of Haskell,<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> and<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the<br>

>>>>>>>>> card marking overhead. These objects are all small, 3-4 pointers wide. Card<br>

>>>>>>>>> marking doesn't help.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> Alternately I could just try to do really evil things and convert<br>

>>>>>>>>> the whole mess to SmallArrays and then figure out how to unsafeCoerce my way<br>

>>>>>>>>> to glory, stuffing the #'d references to the other arrays directly into the<br>

>>>>>>>>> SmallArray as slots, removing the limitation  we see here by aping the<br>

>>>>>>>>> MutableArrayArray# s API, but that gets really really dangerous!<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> I'm pretty much willing to sacrifice almost anything on the altar<br>

>>>>>>>>> of speed here, but I'd like to be able to let the GC move them and collect<br>

>>>>>>>>> them which rules out simpler Ptr and Addr based solutions.<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> -Edward<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty<br>

>>>>>>>>> <<a href="mailto:chak@cse.unsw.edu.au">chak@cse.unsw.edu.au</a>> wrote:<br>

>>>>>>>>><br>

>>>>>>>>> That’s an interesting idea.<br>

>>>>>>>>><br>

>>>>>>>>> Manuel<br>

>>>>>>>>><br>

>>>>>>>>> > Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>>:<br>

>>>>>>>>><br>

>>>>>>>>> ><br>

>>>>>>>>> > Would it be possible to add unsafe primops to add Array# and<br>

>>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# entries<br>

>>>>>>>>> > are all directly unlifted avoiding a level of indirection for the containing<br>

>>>>>>>>> > structure is amazing, but I can only currently use it if my leaf level data<br>

>>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be<br>

>>>>>>>>> > able to have the ability to put SmallArray# a stuff down at the leaves to<br>

>>>>>>>>> > hold lifted contents.<br>

>>>>>>>>> ><br>

>>>>>>>>> > I accept fully that if I name the wrong type when I go to access<br>

>>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do that if i tried to<br>

>>>>>>>>> > use one of the members that held a nested ArrayArray# as a ByteArray#<br>

>>>>>>>>> > anyways, so it isn't like there is a safety story preventing this.<br>

>>>>>>>>> ><br>

>>>>>>>>> > I've been hunting for ways to try to kill the indirection<br>

>>>>>>>>> > problems I get with Haskell and mutable structures, and I could shoehorn a<br>

>>>>>>>>> > number of them into ArrayArrays if this worked.<br>

>>>>>>>>> ><br>

>>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary<br>

>>>>>>>>> > indirection compared to c/java and this could reduce that pain to just 1<br>

>>>>>>>>> > level of unnecessary indirection.<br>

>>>>>>>>> ><br>

>>>>>>>>> > -Edward<br>

>>>>>>>>><br>

>>>>>>>>> > _______________________________________________<br>

>>>>>>>>> > ghc-devs mailing list<br>

>>>>>>>>> > <a href="mailto:ghc-devs@haskell.org">ghc-devs@haskell.org</a><br>

>>>>>>>>> > <a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> _______________________________________________<br>

>>>>>>>>> ghc-devs mailing list<br>

>>>>>>>>> <a href="mailto:ghc-devs@haskell.org">ghc-devs@haskell.org</a><br>

>>>>>>>>> <a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br>

>>>>>>><br>

>>>>>>><br>

>>>>><br>

>>><br>

>><br>

><br>

><br>

> _______________________________________________<br>

> ghc-devs mailing list<br>

> <a href="mailto:ghc-devs@haskell.org">ghc-devs@haskell.org</a><br>

> <a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br>

><br>

</div></div></blockquote></div><br></div>