<div dir="ltr">I'd love to have that last 10%, but its a lot of work to get there and more importantly I don't know quite what it should look like. <div><br></div><div>On the other hand, I do have a pretty good idea of how the primitives above could be banged out and tested in a long evening, well in time for 7.12. And as noted earlier, those remain useful even if a nicer typed version with an extra level of indirection to the sizes is built up after.<div><br></div><div>The rest sounds like a good graduate student project for someone who has graduate students lying around. Maybe somebody at Indiana University who has an interest in type theory and parallelism can find us one. =)<div><br></div><div>-Edward</div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates <span dir="ltr"><<a href="mailto:fryguybob@gmail.com" target="_blank">fryguybob@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I think from my perspective, the motivation for getting the type<br>
checker involved is primarily bringing this to the level where users<br>
could be expected to build these structures. it is reasonable to<br>
think that there are people who want to use STM (a context with<br>
mutation already) to implement a straight forward data structure that<br>
avoids extra indirection penalty. There should be some places where<br>
knowing that things are field accesses rather then array indexing<br>
could be helpful, but I think GHC is good right now about handling<br>
constant offsets. In my code I don't do any bounds checking as I know<br>
I will only be accessing my arrays with constant indexes. I make<br>
wrappers for each field access and leave all the unsafe stuff in<br>
there. When things go wrong though, the compiler is no help. Maybe<br>
template Haskell that generates the appropriate wrappers is the right<br>
direction to go.<br>
There is another benefit for me when working with these as arrays in<br>
that it is quite simple and direct (given the hoops already jumped<br>
through) to play with alignment. I can ensure two pointers are never<br>
on the same cache-line by just spacing things out in the array.<br>
<div class="HOEnZb"><div class="h5"><br>
On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>> wrote:<br>
> They just segfault at this level. ;)<br>
><br>
> Sent from my iPhone<br>
><br>
> On Aug 28, 2015, at 7:25 PM, Ryan Newton <<a href="mailto:rrnewton@gmail.com">rrnewton@gmail.com</a>> wrote:<br>
><br>
> You presumably also save a bounds check on reads by hard-coding the sizes?<br>
><br>
> On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>> wrote:<br>
>><br>
>> Also there are 4 different "things" here, basically depending on two<br>
>> independent questions:<br>
>><br>
>> a.) if you want to shove the sizes into the info table, and<br>
>> b.) if you want cardmarking.<br>
>><br>
>> Versions with/without cardmarking for different sizes can be done pretty<br>
>> easily, but as noted, the infotable variants are pretty invasive.<br>
>><br>
>> -Edward<br>
>><br>
>> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>> wrote:<br>
>>><br>
>>> Well, on the plus side you'd save 16 bytes per object, which adds up if<br>
>>> they were small enough and there are enough of them. You get a bit better<br>
>>> locality of reference in terms of what fits in the first cache line of them.<br>
>>><br>
>>> -Edward<br>
>>><br>
>>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton <<a href="mailto:rrnewton@gmail.com">rrnewton@gmail.com</a>> wrote:<br>
>>>><br>
>>>> Yes. And for the short term I can imagine places we will settle with<br>
>>>> arrays even if it means tracking lengths unnecessarily and unsafeCoercing<br>
>>>> pointers whose types don't actually match their siblings.<br>
>>>><br>
>>>> Is there anything to recommend the hacks mentioned for fixed sized array<br>
>>>> objects *other* than using them to fake structs? (Much to derecommend, as<br>
>>>> you mentioned!)<br>
>>>><br>
>>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>> wrote:<br>
>>>>><br>
>>>>> I think both are useful, but the one you suggest requires a lot more<br>
>>>>> plumbing and doesn't subsume all of the usecases of the other.<br>
>>>>><br>
>>>>> -Edward<br>
>>>>><br>
>>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton <<a href="mailto:rrnewton@gmail.com">rrnewton@gmail.com</a>><br>
>>>>> wrote:<br>
>>>>>><br>
>>>>>> So that primitive is an array like thing (Same pointed type, unbounded<br>
>>>>>> length) with extra payload.<br>
>>>>>><br>
>>>>>> I can see how we can do without structs if we have arrays, especially<br>
>>>>>> with the extra payload at front. But wouldn't the general solution for<br>
>>>>>> structs be one that that allows new user data type defs for # types?<br>
>>>>>><br>
>>>>>><br>
>>>>>><br>
>>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>> wrote:<br>
>>>>>>><br>
>>>>>>> Some form of MutableStruct# with a known number of words and a known<br>
>>>>>>> number of pointers is basically what Ryan Yates was suggesting above, but<br>
>>>>>>> where the word counts were stored in the objects themselves.<br>
>>>>>>><br>
>>>>>>> Given that it'd have a couple of words for those counts it'd likely<br>
>>>>>>> want to be something we build in addition to MutVar# rather than a<br>
>>>>>>> replacement.<br>
>>>>>>><br>
>>>>>>> On the other hand, if we had to fix those numbers and build info<br>
>>>>>>> tables that knew them, and typechecker support, for instance, it'd get<br>
>>>>>>> rather invasive.<br>
>>>>>>><br>
>>>>>>> Also, a number of things that we can do with the 'sized' versions<br>
>>>>>>> above, like working with evil unsized c-style arrays directly inline at the<br>
>>>>>>> end of the structure cease to be possible, so it isn't even a pure win if we<br>
>>>>>>> did the engineering effort.<br>
>>>>>>><br>
>>>>>>> I think 90% of the needs I have are covered just by adding the one<br>
>>>>>>> primitive. The last 10% gets pretty invasive.<br>
>>>>>>><br>
>>>>>>> -Edward<br>
>>>>>>><br>
>>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton <<a href="mailto:rrnewton@gmail.com">rrnewton@gmail.com</a>><br>
>>>>>>> wrote:<br>
>>>>>>>><br>
>>>>>>>> I like the possibility of a general solution for mutable structs<br>
>>>>>>>> (like Ed said), and I'm trying to fully understand why it's hard.<br>
>>>>>>>><br>
>>>>>>>> So, we can't unpack MutVar into constructors because of object<br>
>>>>>>>> identity problems. But what about directly supporting an extensible set of<br>
>>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) MutVar#? That<br>
>>>>>>>> may be too much work, but is it problematic otherwise?<br>
>>>>>>>><br>
>>>>>>>> Needless to say, this is also critical if we ever want best in class<br>
>>>>>>>> lockfree mutable structures, just like their Stm and sequential<br>
>>>>>>>> counterparts.<br>
>>>>>>>><br>
>>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones<br>
>>>>>>>> <<a href="mailto:simonpj@microsoft.com">simonpj@microsoft.com</a>> wrote:<br>
>>>>>>>>><br>
>>>>>>>>> At the very least I'll take this email and turn it into a short<br>
>>>>>>>>> article.<br>
>>>>>>>>><br>
>>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe<br>
>>>>>>>>> make a ticket for it.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Thanks<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Simon<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> From: Edward Kmett [mailto:<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>]<br>
>>>>>>>>> Sent: 27 August 2015 16:54<br>
>>>>>>>>> To: Simon Peyton Jones<br>
>>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs<br>
>>>>>>>>> Subject: Re: ArrayArrays<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It<br>
>>>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> While those live in #, they are garbage collected objects, so this<br>
>>>>>>>>> all lives on the heap.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> They were added to make some of the DPH stuff fast when it has to<br>
>>>>>>>>> deal with nested arrays.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> I'm currently abusing them as a placeholder for a better thing.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> The Problem<br>
>>>>>>>>><br>
>>>>>>>>> -----------------<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Consider the scenario where you write a classic doubly-linked list<br>
>>>>>>>>> in Haskell.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Chasing from one DLL to the next requires following 3 pointers on<br>
>>>>>>>>> the heap.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe<br>
>>>>>>>>> DLL ~> DLL<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> That is 3 levels of indirection.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> We can trim one by simply unpacking the IORef with<br>
>>>>>>>>> -funbox-strict-fields or UNPACK<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and<br>
>>>>>>>>> worsening our representation.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> but now we're still stuck with a level of indirection<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> This means that every operation we perform on this structure will<br>
>>>>>>>>> be about half of the speed of an implementation in most other languages<br>
>>>>>>>>> assuming we're memory bound on loading things into cache!<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Making Progress<br>
>>>>>>>>><br>
>>>>>>>>> ----------------------<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> I have been working on a number of data structures where the<br>
>>>>>>>>> indirection of going from something in * out to an object in # which<br>
>>>>>>>>> contains the real pointer to my target and coming back effectively doubles<br>
>>>>>>>>> my runtime.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> We go out to the MutVar# because we are allowed to put the MutVar#<br>
>>>>>>>>> onto the mutable list when we dirty it. There is a well defined<br>
>>>>>>>>> write-barrier.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> I could change out the representation to use<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> I can just store two pointers in the MutableArray# every time, but<br>
>>>>>>>>> this doesn't help _much_ directly. It has reduced the amount of distinct<br>
>>>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per object to 2.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> I still have to go out to the heap from my DLL and get to the array<br>
>>>>>>>>> object and then chase it to the next DLL and chase that to the next array. I<br>
>>>>>>>>> do get my two pointers together in memory though. I'm paying for a card<br>
>>>>>>>>> marking table as well, which I don't particularly need with just two<br>
>>>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" machinery added<br>
>>>>>>>>> back in 7.10, which is just the old array code a a new data type, which can<br>
>>>>>>>>> speed things up a bit when you don't have very big arrays:<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> But what if I wanted my object itself to live in # and have two<br>
>>>>>>>>> mutable fields and be able to share the sme write barrier?<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> An ArrayArray# points directly to other unlifted array types. What<br>
>>>>>>>>> if we have one # -> * wrapper on the outside to deal with the impedence<br>
>>>>>>>>> mismatch between the imperative world and Haskell, and then just let the<br>
>>>>>>>>> ArrayArray#'s hold other arrayarrays.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld)<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> now I need to make up a new Nil, which I can just make be a special<br>
>>>>>>>>> MutableArrayArray# I allocate on program startup. I can even abuse pattern<br>
>>>>>>>>> synonyms. Alternately I can exploit the internals further to make this<br>
>>>>>>>>> cheaper.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Then I can use the readMutableArrayArray# and<br>
>>>>>>>>> writeMutableArrayArray# calls to directly access the preceding and next<br>
>>>>>>>>> entry in the linked list.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a<br>
>>>>>>>>> strict world, and everything there lives in #.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> next :: DLL -> IO DLL<br>
>>>>>>>>><br>
>>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of<br>
>>>>>>>>><br>
>>>>>>>>> (# s', n #) -> (# s', DLL n #)<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> It turns out GHC is quite happy to optimize all of that code to<br>
>>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty easily when they<br>
>>>>>>>>> are known strict and you chain operations of this sort!<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Cleaning it Up<br>
>>>>>>>>><br>
>>>>>>>>> ------------------<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Now I have one outermost indirection pointing to an array that<br>
>>>>>>>>> points directly to other arrays.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> I'm stuck paying for a card marking table per object, but I can fix<br>
>>>>>>>>> that by duplicating the code for MutableArrayArray# and using a<br>
>>>>>>>>> SmallMutableArray#. I can hack up primops that let me store a mixture of<br>
>>>>>>>>> SmallMutableArray# fields and normal ones in the data structure.<br>
>>>>>>>>> Operationally, I can even do so by just unsafeCoercing the existing<br>
>>>>>>>>> SmallMutableArray# primitives to change the kind of one of the arguments it<br>
>>>>>>>>> takes.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> This is almost ideal, but not quite. I often have fields that would<br>
>>>>>>>>> be best left unboxed.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> was able to unpack the Int, but we lost that. We can currently at<br>
>>>>>>>>> best point one of the entries of the SmallMutableArray# at a boxed or at a<br>
>>>>>>>>> MutableByteArray# for all of our misc. data and shove the int in question in<br>
>>>>>>>>> there.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to<br>
>>>>>>>>> store masks and administrivia as I walk down the tree. Having to go off to<br>
>>>>>>>>> the side costs me the entire win from avoiding the first pointer chase.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> But, if like Ryan suggested, we had a heap object we could<br>
>>>>>>>>> construct that had n words with unsafe access and m pointers to other heap<br>
>>>>>>>>> objects, one that could put itself on the mutable list when any of those<br>
>>>>>>>>> pointers changed then I could shed this last factor of two in all<br>
>>>>>>>>> circumstances.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Prototype<br>
>>>>>>>>><br>
>>>>>>>>> -------------<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Over the last few days I've put together a small prototype<br>
>>>>>>>>> implementation with a few non-trivial imperative data structures for things<br>
>>>>>>>>> like Tarjan's link-cut trees, the list labeling problem and<br>
>>>>>>>>> order-maintenance.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> <a href="https://github.com/ekmett/structs" rel="noreferrer" target="_blank">https://github.com/ekmett/structs</a><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Notable bits:<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation of link-cut<br>
>>>>>>>>> trees in this style.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Data.Struct.Internal provides the rather horrifying guts that make<br>
>>>>>>>>> it go fast.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Once compiled with -O or -O2, if you look at the core, almost all<br>
>>>>>>>>> the references to the LinkCut or Object data constructor get optimized away,<br>
>>>>>>>>> and we're left with beautiful strict code directly mutating out underlying<br>
>>>>>>>>> representation.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> At the very least I'll take this email and turn it into a short<br>
>>>>>>>>> article.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> -Edward<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones<br>
>>>>>>>>> <<a href="mailto:simonpj@microsoft.com">simonpj@microsoft.com</a>> wrote:<br>
>>>>>>>>><br>
>>>>>>>>> Just to say that I have no idea what is going on in this thread.<br>
>>>>>>>>> What is ArrayArray? What is the issue in general? Is there a ticket? Is<br>
>>>>>>>>> there a wiki page?<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> If it’s important, an ab-initio wiki page + ticket would be a good<br>
>>>>>>>>> thing.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Simon<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> From: ghc-devs [mailto:<a href="mailto:ghc-devs-bounces@haskell.org">ghc-devs-bounces@haskell.org</a>] On Behalf Of<br>
>>>>>>>>> Edward Kmett<br>
>>>>>>>>> Sent: 21 August 2015 05:25<br>
>>>>>>>>> To: Manuel M T Chakravarty<br>
>>>>>>>>> Cc: Simon Marlow; ghc-devs<br>
>>>>>>>>> Subject: Re: ArrayArrays<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be<br>
>>>>>>>>> very handy as well.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Consider right now if I have something like an order-maintenance<br>
>>>>>>>>> structure I have:<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-#<br>
>>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s))<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-#<br>
>>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-#<br>
>>>>>>>>> UNPACK #-} !(MutVar s (Lower s))<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> The former contains, logically, a mutable integer and two pointers,<br>
>>>>>>>>> one for forward and one for backwards. The latter is basically the same<br>
>>>>>>>>> thing with a mutable reference up pointing at the structure above.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> On the heap this is an object that points to a structure for the<br>
>>>>>>>>> bytearray, and points to another structure for each mutvar which each point<br>
>>>>>>>>> to the other 'Upper' structure. So there is a level of indirection smeared<br>
>>>>>>>>> over everything.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> So this is a pair of doubly linked lists with an upward link from<br>
>>>>>>>>> the structure below to the structure above.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Converted into ArrayArray#s I'd get<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> data Upper s = Upper (MutableArrayArray# s)<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and the<br>
>>>>>>>>> next 2 slots pointing to the previous and next previous objects, represented<br>
>>>>>>>>> just as their MutableArrayArray#s. I can use sameMutableArrayArray# on these<br>
>>>>>>>>> for object identity, which lets me check for the ends of the lists by tying<br>
>>>>>>>>> things back on themselves.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> and below that<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> data Lower s = Lower (MutableArrayArray# s)<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> is similar, with an extra MutableArrayArray slot pointing up to an<br>
>>>>>>>>> upper structure.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> I can then write a handful of combinators for getting out the slots<br>
>>>>>>>>> in question, while it has gained a level of indirection between the wrapper<br>
>>>>>>>>> to put it in * and the MutableArrayArray# s in #, that one can be basically<br>
>>>>>>>>> erased by ghc.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Unlike before I don't have several separate objects on the heap for<br>
>>>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for the object itself,<br>
>>>>>>>>> and the MutableByteArray# that it references to carry around the mutable<br>
>>>>>>>>> int.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> The only pain points are<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> 1.) the aforementioned limitation that currently prevents me from<br>
>>>>>>>>> stuffing normal boxed data through a SmallArray or Array into an ArrayArray<br>
>>>>>>>>> leaving me in a little ghetto disconnected from the rest of Haskell,<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> and<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the<br>
>>>>>>>>> card marking overhead. These objects are all small, 3-4 pointers wide. Card<br>
>>>>>>>>> marking doesn't help.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> Alternately I could just try to do really evil things and convert<br>
>>>>>>>>> the whole mess to SmallArrays and then figure out how to unsafeCoerce my way<br>
>>>>>>>>> to glory, stuffing the #'d references to the other arrays directly into the<br>
>>>>>>>>> SmallArray as slots, removing the limitation we see here by aping the<br>
>>>>>>>>> MutableArrayArray# s API, but that gets really really dangerous!<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> I'm pretty much willing to sacrifice almost anything on the altar<br>
>>>>>>>>> of speed here, but I'd like to be able to let the GC move them and collect<br>
>>>>>>>>> them which rules out simpler Ptr and Addr based solutions.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> -Edward<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty<br>
>>>>>>>>> <<a href="mailto:chak@cse.unsw.edu.au">chak@cse.unsw.edu.au</a>> wrote:<br>
>>>>>>>>><br>
>>>>>>>>> That’s an interesting idea.<br>
>>>>>>>>><br>
>>>>>>>>> Manuel<br>
>>>>>>>>><br>
>>>>>>>>> > Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>>:<br>
>>>>>>>>><br>
>>>>>>>>> ><br>
>>>>>>>>> > Would it be possible to add unsafe primops to add Array# and<br>
>>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# entries<br>
>>>>>>>>> > are all directly unlifted avoiding a level of indirection for the containing<br>
>>>>>>>>> > structure is amazing, but I can only currently use it if my leaf level data<br>
>>>>>>>>> > can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be<br>
>>>>>>>>> > able to have the ability to put SmallArray# a stuff down at the leaves to<br>
>>>>>>>>> > hold lifted contents.<br>
>>>>>>>>> ><br>
>>>>>>>>> > I accept fully that if I name the wrong type when I go to access<br>
>>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd do that if i tried to<br>
>>>>>>>>> > use one of the members that held a nested ArrayArray# as a ByteArray#<br>
>>>>>>>>> > anyways, so it isn't like there is a safety story preventing this.<br>
>>>>>>>>> ><br>
>>>>>>>>> > I've been hunting for ways to try to kill the indirection<br>
>>>>>>>>> > problems I get with Haskell and mutable structures, and I could shoehorn a<br>
>>>>>>>>> > number of them into ArrayArrays if this worked.<br>
>>>>>>>>> ><br>
>>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary<br>
>>>>>>>>> > indirection compared to c/java and this could reduce that pain to just 1<br>
>>>>>>>>> > level of unnecessary indirection.<br>
>>>>>>>>> ><br>
>>>>>>>>> > -Edward<br>
>>>>>>>>><br>
>>>>>>>>> > _______________________________________________<br>
>>>>>>>>> > ghc-devs mailing list<br>
>>>>>>>>> > <a href="mailto:ghc-devs@haskell.org">ghc-devs@haskell.org</a><br>
>>>>>>>>> > <a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> _______________________________________________<br>
>>>>>>>>> ghc-devs mailing list<br>
>>>>>>>>> <a href="mailto:ghc-devs@haskell.org">ghc-devs@haskell.org</a><br>
>>>>>>>>> <a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br>
>>>>>>><br>
>>>>>>><br>
>>>>><br>
>>><br>
>><br>
><br>
><br>
> _______________________________________________<br>
> ghc-devs mailing list<br>
> <a href="mailto:ghc-devs@haskell.org">ghc-devs@haskell.org</a><br>
> <a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br>
><br>
</div></div></blockquote></div><br></div>