ArrayArrays

Fri Aug 21 13:49:47 UTC 2015

Hi Edward,

I've been working on removing indirection in STM and I added a heap
object like SmallArray, but with a mix of words and pointers (as well
as a header with metadata for STM).  It appears to work well now, but
it is missing the type information.  All the pointers have the same
type which works fine for your Upper.  In my case I use it to
represent a red-black tree node [1].

Also all the structures I make are fixed size and it would be nice if
the compiler could treat that fix size like a constant in code
generation.  I don't know what the right design is or what would be
needed, but it seems simple enough to give the right typing
information to something like this and basically get a mutable struct.
I'm talking about this work at HIW and really hope to find someone
interested in extending this expressiveness to let us write something
that looks clear in Haskell, but gives the heap representation that we
really need for performance.  From the RTS perspective I think there
are any obstacles.

[1]: https://github.com/fryguybob/ghc-stm-benchmarks/blob/master/benchmarks/RBTree-Throughput/RBTreeNode.hs

Ryan

On Fri, Aug 21, 2015 at 12:25 AM, Edward Kmett <ekmett at gmail.com> wrote:
> When (ab)using them for this purpose, SmallArrayArray's would be very handy
> as well.
>
> Consider right now if I have something like an order-maintenance structure I
> have:
>
> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-}
> !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s))
>
> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-}
> !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-}
> !(MutVar s (Lower s))
>
> The former contains, logically, a mutable integer and two pointers, one for
> forward and one for backwards. The latter is basically the same thing with a
> mutable reference up pointing at the structure above.
>
> On the heap this is an object that points to a structure for the bytearray,
> and points to another structure for each mutvar which each point to the
> other 'Upper' structure. So there is a level of indirection smeared over
> everything.
>
> So this is a pair of doubly linked lists with an upward link from the
> structure below to the structure above.
>
> Converted into ArrayArray#s I'd get
>
> data Upper s = Upper (MutableArrayArray# s)
>
> w/ the first slot being a pointer to a MutableByteArray#, and the next 2
> slots pointing to the previous and next previous objects, represented just
> as their MutableArrayArray#s. I can use sameMutableArrayArray# on these for
> object identity, which lets me check for the ends of the lists by tying
> things back on themselves.
>
> and below that
>
> data Lower s = Lower (MutableArrayArray# s)
>
> is similar, with an extra MutableArrayArray slot pointing up to an upper
> structure.
>
> I can then write a handful of combinators for getting out the slots in
> question, while it has gained a level of indirection between the wrapper to
> put it in * and the MutableArrayArray# s in #, that one can be basically
> erased by ghc.
>
> Unlike before I don't have several separate objects on the heap for each
> thing. I only have 2 now. The MutableArrayArray# for the object itself, and
> the MutableByteArray# that it references to carry around the mutable int.
>
> The only pain points are
>
> 1.) the aforementioned limitation that currently prevents me from stuffing
> normal boxed data through a SmallArray or Array into an ArrayArray leaving
> me in a little ghetto disconnected from the rest of Haskell,
>
> and
>
> 2.) the lack of SmallArrayArray's, which could let us avoid the card marking
> overhead. These objects are all small, 3-4 pointers wide. Card marking
> doesn't help.
>
> Alternately I could just try to do really evil things and convert the whole
> mess to SmallArrays and then figure out how to unsafeCoerce my way to glory,
> stuffing the #'d references to the other arrays directly into the SmallArray
> as slots, removing the limitation  we see here by aping the
> MutableArrayArray# s API, but that gets really really dangerous!
>
> I'm pretty much willing to sacrifice almost anything on the altar of speed
> here, but I'd like to be able to let the GC move them and collect them which
> rules out simpler Ptr and Addr based solutions.
>
> -Edward
>
> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty
> <chak at cse.unsw.edu.au> wrote:
>>
>> That’s an interesting idea.
>>
>> Manuel
>>
>> > Edward Kmett <ekmett at gmail.com>:
>> >
>> > Would it be possible to add unsafe primops to add Array# and SmallArray#
>> > entries to an ArrayArray#? The fact that the ArrayArray# entries are all
>> > directly unlifted avoiding a level of indirection for the containing
>> > structure is amazing, but I can only currently use it if my leaf level data
>> > can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be
>> > able to have the ability to put SmallArray# a stuff down at the leaves to
>> > hold lifted contents.
>> >
>> > I accept fully that if I name the wrong type when I go to access one of
>> > the fields it'll lie to me, but I suppose it'd do that if i tried to use one
>> > of the members that held a nested ArrayArray# as a ByteArray# anyways, so it
>> > isn't like there is a safety story preventing this.
>> >
>> > I've been hunting for ways to try to kill the indirection problems I get
>> > with Haskell and mutable structures, and I could shoehorn a number of them
>> > into ArrayArrays if this worked.
>> >
>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection
>> > compared to c/java and this could reduce that pain to just 1 level of
>> > unnecessary indirection.
>> >
>> > -Edward
>> > _______________________________________________
>> > ghc-devs mailing list
>> > ghc-devs at haskell.org
>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>
>
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>