ArrayArrays

Tue Sep 8 11:50:27 UTC 2015

I'm not willing to fight too hard for it, but it feels more like the "right" solution than retaining a cut-and-paste copy of the same code and bifurcating further on each argument you want to consider such a degree of freedom.

Like I say, I’m not against allowing polymorphism over unlifted-but-boxed types, and I can see the advantages.    But it’s a separate proposal in its own right.

Simon

From: Edward Kmett [mailto:ekmett at gmail.com]
Sent: 08 September 2015 09:30
To: Simon Marlow
Cc: Simon Peyton Jones; Ryan Newton; Johan Tibell; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates
Subject: Re: ArrayArrays

Once you start to include all the other primitive types there is a bit more of an explosion. MVar#, TVar#, MutVar#, Small variants, etc. can all be modified to carry unlifted content.

Being able to be parametric over that choice would permit a number of things in user land to do the same thing with an open-ended set of design possibilities that are rather hard to contemplate in advance. e.g. being able to abstract over them could let you just use a normal (,) to carry around unlifted parametric data types or being able to talk about [MVar# s a] drastically reducing the number of one off data types we need to invent.

If you can talk about the machinery mentioned above then you can have typeclasses parameterized on an argument that could be either unlifted or lifted.

I'm not willing to fight too hard for it, but it feels more like the "right" solution than retaining a cut-and-paste copy of the same code and bifurcating further on each argument you want to consider such a degree of freedom.

As such it seems like a pretty big win for a comparatively minor change to the levity polymorphism machinery.

-Edward

On Tue, Sep 8, 2015 at 3:40 AM, Simon Marlow <marlowsd at gmail.com<mailto:marlowsd at gmail.com>> wrote:
This would be very cool, however it's questionable whether it's worth it.

Without any unlifted kind, we need
 - ArrayArray#
 - a set of new/read/write primops for every element type,
   either built-in or made from unsafeCoerce#

With the unlifted kind, we would need
 - ArrayArray#
 - one set of new/read/write primops

With levity polymorphism, we would need
 - none of this, Array# can be used

So having an unlifted kind already kills a lot of the duplication, polymorphism only kills a bit more.

Cheers
Simon

On 08/09/2015 00:14, Edward Kmett wrote:
Assume we had the ability to talk about Levity in a new way and instead
of just:

data Levity = Lifted | Unlifted

type * = TYPE 'Lifted
type # = TYPE 'Unlifted

we replace had a more nuanced notion of TYPE parameterized on another
data type:

data Levity = Lifted | Unlifted
data Param = Composite | Simple Levity

and we parameterized TYPE with a Param rather than Levity.

Existing strange representations can continue to live in TYPE 'Composite

(# Int# , Double #) :: TYPE 'Composite

and we don't support parametricity in there, just like, currently we
don't allow parametricity in #.

We can include the undefined example from Richard's talk:

undefined :: forall (v :: Param). v

and ultimately lift it into his pi type when it is available just as before.

But we could let consider TYPE ('Simple 'Unlifted) as a form of
'parametric #' covering unlifted things we're willing to allow
polymorphism over because they are just pointers to something in the
heap, that just happens to not be able to be _|_ or a thunk.

In this setting, recalling that above, I modified Richard's TYPE to take
a Param instead of Levity, we can define a type alias for things that
live as a simple pointer to a heap allocated object:

type GC (l :: Levity) = TYPE ('Simple l)
type * = GC 'Lifted

and then we can look at existing primitives generalized:

Array# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted
MutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted
SmallArray# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted
SmallMutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC
'Unlifted
MutVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted
MVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted

Weak#, StablePtr#, StableName#, etc. all can take similar modifications.

Recall that an ArrayArray# was just an Array# hacked up to be able to
hold onto the subset of # that is collectable.

Almost all of the operations on these data types can work on the more
general kind of argument.

newArray# :: forall (s :: *) (l :: Levity) (a :: GC l). Int# -> a ->
State# s -> (# State# s, MutableArray# s a #)

writeArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray#
s a -> Int# -> a -> State# s -> State# s

readArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# s
a -> Int# -> State# s -> (# State# s, a #)

etc.

Only a couple of our existing primitives _can't_ generalize this way.
The one that leaps to mind is atomicModifyMutVar, which would need to
stay constrained to only work on arguments in *, because of the way it
operates.

With that we can still talk about

MutableArray# s Int

but now we can also talk about:

MutableArray# s (MutableArray# s Int)

without the layer of indirection through a box in * and without an
explosion of primops. The same newFoo, readFoo, writeFoo machinery works
for both kinds.

The struct machinery doesn't get to take advantage of this, but it would
let us clean house elsewhere in Prim and drastically improve the range
of applicability of the existing primitives with nothing more than a
small change to the levity machinery.

I'm not attached to any of the names above, I coined them just to give
us a concrete thing to talk about.

Here I'm only proposing we extend machinery in GHC.Prim this way, but an
interesting 'now that the barn door is open' question is to consider
that our existing Haskell data types often admit a similar form of
parametricity and nothing in principle prevents this from working for
Maybe or [] and once you permit inference to fire across all of GC l
then it seems to me that you'd start to get those same capabilities
there as well when LevityPolymorphism was turned on.

-Edward

On Mon, Sep 7, 2015 at 5:56 PM, Simon Peyton Jones
<simonpj at microsoft.com<mailto:simonpj at microsoft.com> <mailto:simonpj at microsoft.com<mailto:simonpj at microsoft.com>>> wrote:

    This could make the menagerie of ways to pack
    {Small}{Mutable}Array{Array}# references into a
    {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing
    the need for folks to descend into the use of the more evil
    structure primitives we're talking about, and letting us keep a few
    more principles around us.____

    __ __

    I’m lost. Can you give some concrete examples that illustrate how
    levity polymorphism will help us?____

    Simon____

    __ __

    *From:*Edward Kmett [mailto:ekmett at gmail.com<mailto:ekmett at gmail.com> <mailto:ekmett at gmail.com<mailto:ekmett at gmail.com>>]
    *Sent:* 07 September 2015 21:17
    *To:* Simon Peyton Jones
    *Cc:* Ryan Newton; Johan Tibell; Simon Marlow; Manuel M T
    Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates
    *Subject:* Re: ArrayArrays____

    __ __

    I had a brief discussion with Richard during the Haskell Symposium
    about how we might be able to let parametricity help a bit in
    reducing the space of necessarily primops to a slightly more
    manageable level. ____

    __ __

    Notably, it'd be interesting to explore the ability to allow
    parametricity over the portion of # that is just a gcptr.____

    __ __

    We could do this if the levity polymorphism machinery was tweaked a
    bit. You could envision the ability to abstract over things in both
    * and the subset of # that are represented by a gcptr, then
    modifying the existing array primitives to be parametric in that
    choice of levity for their argument so long as it was of a "heap
    object" levity.____

    __ __

    This could make the menagerie of ways to pack
    {Small}{Mutable}Array{Array}# references into a
    {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing
    the need for folks to descend into the use of the more evil
    structure primitives we're talking about, and letting us keep a few
    more principles around us.____

    __ __

    Then in the cases like `atomicModifyMutVar#` where it needs to
    actually be in * rather than just a gcptr, due to the constructed
    field selectors it introduces on the heap then we could keep the
    existing less polymorphic type.____

    __ __

    -Edward____

    __ __

    On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones
    <simonpj at microsoft.com<mailto:simonpj at microsoft.com> <mailto:simonpj at microsoft.com<mailto:simonpj at microsoft.com>>> wrote:____

        It was fun to meet and discuss this.____

        ____

        Did someone volunteer to write a wiki page that describes the
        proposed design?  And, I earnestly hope, also describes the
        menagerie of currently available array types and primops so that
        users can have some chance of picking the right one?!____

        ____

        Thanks____

        ____

        Simon____

        ____

        *From:*ghc-devs [mailto:ghc-devs-bounces at haskell.org<mailto:ghc-devs-bounces at haskell.org>
        <mailto:ghc-devs-bounces at haskell.org<mailto:ghc-devs-bounces at haskell.org>>] *On Behalf Of *Ryan Newton
        *Sent:* 31 August 2015 23:11
        *To:* Edward Kmett; Johan Tibell
        *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen;
        ghc-devs; Ryan Scott; Ryan Yates
        *Subject:* Re: ArrayArrays____

        ____

        Dear Edward, Ryan Yates, and other interested parties -- ____

        ____

        So when should we meet up about this?____

        ____

        May I propose the Tues afternoon break for everyone at ICFP who
        is interested in this topic?  We can meet out in the coffee area
        and congregate around Edward Kmett, who is tall and should be
        easy to find ;-).____

        ____

        I think Ryan is going to show us how to use his new primops for
        combined array + other fields in one heap object?____

        ____

        On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett <ekmett at gmail.com<mailto:ekmett at gmail.com>
        <mailto:ekmett at gmail.com<mailto:ekmett at gmail.com>>> wrote:____

            Without a custom primitive it doesn't help much there, you
            have to store the indirection to the mask.____

            ____

            With a custom primitive it should cut the on heap
            root-to-leaf path of everything in the HAMT in half. A
            shorter HashMap was actually one of the motivating factors
            for me doing this. It is rather astoundingly difficult to
            beat the performance of HashMap, so I had to start cheating
            pretty badly. ;)____

            ____

            -Edward____

            ____

            On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell
            <johan.tibell at gmail.com<mailto:johan.tibell at gmail.com> <mailto:johan.tibell at gmail.com<mailto:johan.tibell at gmail.com>>>
            wrote:____

                I'd also be interested to chat at ICFP to see if I can
                use this for my HAMT implementation.____

                ____

                On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett
                <ekmett at gmail.com<mailto:ekmett at gmail.com> <mailto:ekmett at gmail.com<mailto:ekmett at gmail.com>>> wrote:____

                    Sounds good to me. Right now I'm just hacking up
                    composable accessors for "typed slots" in a fairly
                    lens-like fashion, and treating the set of slots I
                    define and the 'new' function I build for the data
                    type as its API, and build atop that. This could
                    eventually graduate to template-haskell, but I'm not
                    entirely satisfied with the solution I have. I
                    currently distinguish between what I'm calling
                    "slots" (things that point directly to another
                    SmallMutableArrayArray# sans wrapper) and "fields"
                    which point directly to the usual Haskell data types
                    because unifying the two notions meant that I
                    couldn't lift some coercions out "far enough" to
                    make them vanish.____

                    ____

                    I'll be happy to run through my current working set
                    of issues in person and -- as things get nailed down
                    further -- in a longer lived medium than in personal
                    conversations. ;)____

                    ____

                    -Edward____

                    ____

                    On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton
                    <rrnewton at gmail.com<mailto:rrnewton at gmail.com> <mailto:rrnewton at gmail.com<mailto:rrnewton at gmail.com>>>
                    wrote:____

                        I'd also love to meet up at ICFP and discuss
                        this.  I think the array primops plus a TH layer
                        that lets (ab)use them many times without too
                        much marginal cost sounds great.  And I'd like
                        to learn how we could be either early users of,
                        or help with, this infrastructure.____

                        ____

                        CC'ing in Ryan Scot and Omer Agacan who may also
                        be interested in dropping in on such discussions
                        @ICFP, and Chao-Hong Chen, a Ph.D. student who
                        is currently working on concurrent data
                        structures in Haskell, but will not be at ICFP.____

                        ____

                        ____

                        On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates
                        <fryguybob at gmail.com<mailto:fryguybob at gmail.com>
                        <mailto:fryguybob at gmail.com<mailto:fryguybob at gmail.com>>> wrote:____

                            I completely agree.  I would love to spend
                            some time during ICFP and
                            friends talking about what it could look
                            like.  My small array for STM
                            changes for the RTS can be seen here [1].
                            It is on a branch somewhere
                            between 7.8 and 7.10 and includes irrelevant
                            STM bits and some
                            confusing naming choices (sorry), but should
                            cover all the details
                            needed to implement it for a non-STM
                            context.  The biggest surprise
                            for me was following small array too closely
                            and having a word/byte
                            offset miss-match [2].

                            [1]:
                            https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut
                            [2]:
                            https://ghc.haskell.org/trac/ghc/ticket/10413

                            Ryan____

                            On Fri, Aug 28, 2015 at 10:09 PM, Edward
                            Kmett <ekmett at gmail.com<mailto:ekmett at gmail.com>
                            <mailto:ekmett at gmail.com<mailto:ekmett at gmail.com>>> wrote:
                             > I'd love to have that last 10%, but its a
                            lot of work to get there and more
                             > importantly I don't know quite what it
                            should look like.
                             >
                             > On the other hand, I do have a pretty
                            good idea of how the primitives above
                             > could be banged out and tested in a long
                            evening, well in time for 7.12. And
                             > as noted earlier, those remain useful
                            even if a nicer typed version with an
                             > extra level of indirection to the sizes
                            is built up after.
                             >
                             > The rest sounds like a good graduate
                            student project for someone who has
                             > graduate students lying around. Maybe
                            somebody at Indiana University who has
                             > an interest in type theory and
                            parallelism can find us one. =)
                             >
                             > -Edward
                             >
                             > On Fri, Aug 28, 2015 at 8:48 PM, Ryan
                            Yates <fryguybob at gmail.com<mailto:fryguybob at gmail.com>
                            <mailto:fryguybob at gmail.com<mailto:fryguybob at gmail.com>>> wrote:
                             >>
                             >> I think from my perspective, the
                            motivation for getting the type
                             >> checker involved is primarily bringing
                            this to the level where users
                             >> could be expected to build these
                            structures.  it is reasonable to
                             >> think that there are people who want to
                            use STM (a context with
                             >> mutation already) to implement a
                            straight forward data structure that
                             >> avoids extra indirection penalty.  There
                            should be some places where
                             >> knowing that things are field accesses
                            rather then array indexing
                             >> could be helpful, but I think GHC is
                            good right now about handling
                             >> constant offsets.  In my code I don't do
                            any bounds checking as I know
                             >> I will only be accessing my arrays with
                            constant indexes.  I make
                             >> wrappers for each field access and leave
                            all the unsafe stuff in
                             >> there.  When things go wrong though, the
                            compiler is no help.  Maybe
                             >> template Haskell that generates the
                            appropriate wrappers is the right
                             >> direction to go.
                             >> There is another benefit for me when
                            working with these as arrays in
                             >> that it is quite simple and direct
                            (given the hoops already jumped
                             >> through) to play with alignment.  I can
                            ensure two pointers are never
                             >> on the same cache-line by just spacing
                            things out in the array.
                             >>
                             >> On Fri, Aug 28, 2015 at 7:33 PM, Edward
                            Kmett <ekmett at gmail.com<mailto:ekmett at gmail.com>
                            <mailto:ekmett at gmail.com<mailto:ekmett at gmail.com>>> wrote:
                             >> > They just segfault at this level. ;)
                             >> >
                             >> > Sent from my iPhone
                             >> >
                             >> > On Aug 28, 2015, at 7:25 PM, Ryan
                            Newton <rrnewton at gmail.com<mailto:rrnewton at gmail.com>
                            <mailto:rrnewton at gmail.com<mailto:rrnewton at gmail.com>>> wrote:
                             >> >
                             >> > You presumably also save a bounds
                            check on reads by hard-coding the
                             >> > sizes?
                             >> >
                             >> > On Fri, Aug 28, 2015 at 3:39 PM,
                            Edward Kmett <ekmett at gmail.com<mailto:ekmett at gmail.com>
                            <mailto:ekmett at gmail.com<mailto:ekmett at gmail.com>>> wrote:
                             >> >>
                             >> >> Also there are 4 different "things"
                            here, basically depending on two
                             >> >> independent questions:
                             >> >>
                             >> >> a.) if you want to shove the sizes
                            into the info table, and
                             >> >> b.) if you want cardmarking.
                             >> >>
                             >> >> Versions with/without cardmarking for
                            different sizes can be done
                             >> >> pretty
                             >> >> easily, but as noted, the infotable
                            variants are pretty invasive.
                             >> >>
                             >> >> -Edward
                             >> >>
                             >> >> On Fri, Aug 28, 2015 at 6:36 PM,
                            Edward Kmett <ekmett at gmail.com<mailto:ekmett at gmail.com>
                            <mailto:ekmett at gmail.com<mailto:ekmett at gmail.com>>> wrote:
                             >> >>>
                             >> >>> Well, on the plus side you'd save 16
                            bytes per object, which adds up
                             >> >>> if
                             >> >>> they were small enough and there are
                            enough of them. You get a bit
                             >> >>> better
                             >> >>> locality of reference in terms of
                            what fits in the first cache line of
                             >> >>> them.
                             >> >>>
                             >> >>> -Edward
                             >> >>>
                             >> >>> On Fri, Aug 28, 2015 at 6:14 PM,
                            Ryan Newton <rrnewton at gmail.com<mailto:rrnewton at gmail.com>
                            <mailto:rrnewton at gmail.com<mailto:rrnewton at gmail.com>>>
                             >> >>> wrote:
                             >> >>>>
                             >> >>>> Yes. And for the short term I can
                            imagine places we will settle with
                             >> >>>> arrays even if it means tracking
                            lengths unnecessarily and
                             >> >>>> unsafeCoercing
                             >> >>>> pointers whose types don't actually
                            match their siblings.
                             >> >>>>
                             >> >>>> Is there anything to recommend the
                            hacks mentioned for fixed sized
                             >> >>>> array
                             >> >>>> objects *other* than using them to
                            fake structs? (Much to
                             >> >>>> derecommend, as
                             >> >>>> you mentioned!)
                             >> >>>>
                             >> >>>> On Fri, Aug 28, 2015 at 3:07 PM
                            Edward Kmett <ekmett at gmail.com<mailto:ekmett at gmail.com>
                            <mailto:ekmett at gmail.com<mailto:ekmett at gmail.com>>>
                             >> >>>> wrote:
                             >> >>>>>
                             >> >>>>> I think both are useful, but the
                            one you suggest requires a lot more
                             >> >>>>> plumbing and doesn't subsume all
                            of the usecases of the other.
                             >> >>>>>
                             >> >>>>> -Edward
                             >> >>>>>
                             >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM,
                            Ryan Newton <rrnewton at gmail.com<mailto:rrnewton at gmail.com>
                            <mailto:rrnewton at gmail.com<mailto:rrnewton at gmail.com>>>
                             >> >>>>> wrote:
                             >> >>>>>>
                             >> >>>>>> So that primitive is an array
                            like thing (Same pointed type,
                             >> >>>>>> unbounded
                             >> >>>>>> length) with extra payload.
                             >> >>>>>>
                             >> >>>>>> I can see how we can do without
                            structs if we have arrays,
                             >> >>>>>> especially
                             >> >>>>>> with the extra payload at front.
                            But wouldn't the general solution
                             >> >>>>>> for
                             >> >>>>>> structs be one that that allows
                            new user data type defs for #
                             >> >>>>>> types?
                             >> >>>>>>
                             >> >>>>>>
                             >> >>>>>>
                             >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM
                            Edward Kmett <ekmett at gmail.com<mailto:ekmett at gmail.com>
                            <mailto:ekmett at gmail.com<mailto:ekmett at gmail.com>>>
                             >> >>>>>> wrote:
                             >> >>>>>>>
                             >> >>>>>>> Some form of MutableStruct# with
                            a known number of words and a
                             >> >>>>>>> known
                             >> >>>>>>> number of pointers is basically
                            what Ryan Yates was suggesting
                             >> >>>>>>> above, but
                             >> >>>>>>> where the word counts were
                            stored in the objects themselves.
                             >> >>>>>>>
                             >> >>>>>>> Given that it'd have a couple of
                            words for those counts it'd
                             >> >>>>>>> likely
                             >> >>>>>>> want to be something we build in
                            addition to MutVar# rather than a
                             >> >>>>>>> replacement.
                             >> >>>>>>>
                             >> >>>>>>> On the other hand, if we had to
                            fix those numbers and build info
                             >> >>>>>>> tables that knew them, and
                            typechecker support, for instance, it'd
                             >> >>>>>>> get
                             >> >>>>>>> rather invasive.
                             >> >>>>>>>
                             >> >>>>>>> Also, a number of things that we
                            can do with the 'sized' versions
                             >> >>>>>>> above, like working with evil
                            unsized c-style arrays directly
                             >> >>>>>>> inline at the
                             >> >>>>>>> end of the structure cease to be
                            possible, so it isn't even a pure
                             >> >>>>>>> win if we
                             >> >>>>>>> did the engineering effort.
                             >> >>>>>>>
                             >> >>>>>>> I think 90% of the needs I have
                            are covered just by adding the one
                             >> >>>>>>> primitive. The last 10% gets
                            pretty invasive.
                             >> >>>>>>>
                             >> >>>>>>> -Edward
                             >> >>>>>>>
                             >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM,
                            Ryan Newton <rrnewton at gmail.com<mailto:rrnewton at gmail.com>
                            <mailto:rrnewton at gmail.com<mailto:rrnewton at gmail.com>>>
                             >> >>>>>>> wrote:
                             >> >>>>>>>>
                             >> >>>>>>>> I like the possibility of a
                            general solution for mutable structs
                             >> >>>>>>>> (like Ed said), and I'm trying
                            to fully understand why it's hard.
                             >> >>>>>>>>
                             >> >>>>>>>> So, we can't unpack MutVar into
                            constructors because of object
                             >> >>>>>>>> identity problems. But what
                            about directly supporting an
                             >> >>>>>>>> extensible set of
                             >> >>>>>>>> unlifted MutStruct# objects,
                            generalizing (and even replacing)
                             >> >>>>>>>> MutVar#? That
                             >> >>>>>>>> may be too much work, but is it
                            problematic otherwise?
                             >> >>>>>>>>
                             >> >>>>>>>> Needless to say, this is also
                            critical if we ever want best in
                             >> >>>>>>>> class
                             >> >>>>>>>> lockfree mutable structures,
                            just like their Stm and sequential
                             >> >>>>>>>> counterparts.
                             >> >>>>>>>>
                             >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM
                            Simon Peyton Jones
                             >> >>>>>>>> <simonpj at microsoft.com<mailto:simonpj at microsoft.com>
                            <mailto:simonpj at microsoft.com<mailto:simonpj at microsoft.com>>> wrote:
                             >> >>>>>>>>>
                             >> >>>>>>>>> At the very least I'll take
                            this email and turn it into a short
                             >> >>>>>>>>> article.
                             >> >>>>>>>>>
                             >> >>>>>>>>> Yes, please do make it into a
                            wiki page on the GHC Trac, and
                             >> >>>>>>>>> maybe
                             >> >>>>>>>>> make a ticket for it.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> Thanks
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> Simon
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> From: Edward Kmett
                            [mailto:ekmett at gmail.com<mailto:ekmett at gmail.com>
                            <mailto:ekmett at gmail.com<mailto:ekmett at gmail.com>>]
                             >> >>>>>>>>> Sent: 27 August 2015 16:54
                             >> >>>>>>>>> To: Simon Peyton Jones
                             >> >>>>>>>>> Cc: Manuel M T Chakravarty;
                            Simon Marlow; ghc-devs
                             >> >>>>>>>>> Subject: Re: ArrayArrays
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> An ArrayArray# is just an
                            Array# with a modified invariant. It
                             >> >>>>>>>>> points directly to other
                            unlifted ArrayArray#'s or ByteArray#'s.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> While those live in #, they
                            are garbage collected objects, so
                             >> >>>>>>>>> this
                             >> >>>>>>>>> all lives on the heap.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> They were added to make some
                            of the DPH stuff fast when it has
                             >> >>>>>>>>> to
                             >> >>>>>>>>> deal with nested arrays.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> I'm currently abusing them as
                            a placeholder for a better thing.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> The Problem
                             >> >>>>>>>>>
                             >> >>>>>>>>> -----------------
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> Consider the scenario where
                            you write a classic doubly-linked
                             >> >>>>>>>>> list
                             >> >>>>>>>>> in Haskell.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> data DLL = DLL (IORef (Maybe
                            DLL) (IORef (Maybe DLL)
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> Chasing from one DLL to the
                            next requires following 3 pointers
                             >> >>>>>>>>> on
                             >> >>>>>>>>> the heap.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~>
                            MutVar# RealWorld (Maybe DLL) ~>
                             >> >>>>>>>>> Maybe
                             >> >>>>>>>>> DLL ~> DLL
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> That is 3 levels of indirection.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> We can trim one by simply
                            unpacking the IORef with
                             >> >>>>>>>>> -funbox-strict-fields or UNPACK
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> We can trim another by adding
                            a 'Nil' constructor for DLL and
                             >> >>>>>>>>> worsening our representation.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> data DLL = DLL !(IORef DLL)
                            !(IORef DLL) | Nil
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> but now we're still stuck with
                            a level of indirection
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL
                            ~> DLL
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> This means that every
                            operation we perform on this structure
                             >> >>>>>>>>> will
                             >> >>>>>>>>> be about half of the speed of
                            an implementation in most other
                             >> >>>>>>>>> languages
                             >> >>>>>>>>> assuming we're memory bound on
                            loading things into cache!
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> Making Progress
                             >> >>>>>>>>>
                             >> >>>>>>>>> ----------------------
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> I have been working on a
                            number of data structures where the
                             >> >>>>>>>>> indirection of going from
                            something in * out to an object in #
                             >> >>>>>>>>> which
                             >> >>>>>>>>> contains the real pointer to
                            my target and coming back
                             >> >>>>>>>>> effectively doubles
                             >> >>>>>>>>> my runtime.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> We go out to the MutVar#
                            because we are allowed to put the
                             >> >>>>>>>>> MutVar#
                             >> >>>>>>>>> onto the mutable list when we
                            dirty it. There is a well defined
                             >> >>>>>>>>> write-barrier.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> I could change out the
                            representation to use
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> data DLL = DLL (MutableArray#
                            RealWorld DLL) | Nil
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> I can just store two pointers
                            in the MutableArray# every time,
                             >> >>>>>>>>> but
                             >> >>>>>>>>> this doesn't help _much_
                            directly. It has reduced the amount of
                             >> >>>>>>>>> distinct
                             >> >>>>>>>>> addresses in memory I touch on
                            a walk of the DLL from 3 per
                             >> >>>>>>>>> object to 2.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> I still have to go out to the
                            heap from my DLL and get to the
                             >> >>>>>>>>> array
                             >> >>>>>>>>> object and then chase it to
                            the next DLL and chase that to the
                             >> >>>>>>>>> next array. I
                             >> >>>>>>>>> do get my two pointers
                            together in memory though. I'm paying for
                             >> >>>>>>>>> a card
                             >> >>>>>>>>> marking table as well, which I
                            don't particularly need with just
                             >> >>>>>>>>> two
                             >> >>>>>>>>> pointers, but we can shed that
                            with the "SmallMutableArray#"
                             >> >>>>>>>>> machinery added
                             >> >>>>>>>>> back in 7.10, which is just
                            the old array code a a new data
                             >> >>>>>>>>> type, which can
                             >> >>>>>>>>> speed things up a bit when you
                            don't have very big arrays:
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> data DLL = DLL
                            (SmallMutableArray# RealWorld DLL) | Nil
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> But what if I wanted my object
                            itself to live in # and have two
                             >> >>>>>>>>> mutable fields and be able to
                            share the sme write barrier?
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> An ArrayArray# points directly
                            to other unlifted array types.
                             >> >>>>>>>>> What
                             >> >>>>>>>>> if we have one # -> * wrapper
                            on the outside to deal with the
                             >> >>>>>>>>> impedence
                             >> >>>>>>>>> mismatch between the
                            imperative world and Haskell, and then just
                             >> >>>>>>>>> let the
                             >> >>>>>>>>> ArrayArray#'s hold other
                            arrayarrays.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> data DLL = DLL
                            (MutableArrayArray# RealWorld)
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> now I need to make up a new
                            Nil, which I can just make be a
                             >> >>>>>>>>> special
                             >> >>>>>>>>> MutableArrayArray# I allocate
                            on program startup. I can even
                             >> >>>>>>>>> abuse pattern
                             >> >>>>>>>>> synonyms. Alternately I can
                            exploit the internals further to
                             >> >>>>>>>>> make this
                             >> >>>>>>>>> cheaper.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> Then I can use the
                            readMutableArrayArray# and
                             >> >>>>>>>>> writeMutableArrayArray# calls
                            to directly access the preceding
                             >> >>>>>>>>> and next
                             >> >>>>>>>>> entry in the linked list.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> So now we have one DLL wrapper
                            which just 'bootstraps me' into a
                             >> >>>>>>>>> strict world, and everything
                            there lives in #.
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> next :: DLL -> IO DLL
                             >> >>>>>>>>>
                             >> >>>>>>>>> next (DLL m) = IO $ \s -> case
                            readMutableArrayArray# s of
                             >> >>>>>>>>>
                             >> >>>>>>>>>    (# s', n #) -> (# s', DLL n #)
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>>
                             >> >>>>>>>>> It turns out GHC is quite
                            happy to optimize all of that code to
                             >> >>>>>>>>> keep things unboxed. The 'DLL'
                            wrappers get removed pretty
                             >> >>>>>>>>> easily when they
                             >> >>>>>>>>> are known strict and you chain

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20150908/3807ff93/attachment-0001.html>