[GHC] #14196: Replace ArrayArray# with either UnliftedArray# or Array#

GHC ghc-devs at haskell.org
Thu Sep 7 23:58:32 UTC 2017


#14196: Replace ArrayArray# with either UnliftedArray# or Array#
-------------------------------------+-------------------------------------
        Reporter:  andrewthad        |                Owner:  (none)
            Type:  feature request   |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.2.1
      Resolution:                    |             Keywords:
                                     |  LevityPolymorphism
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by andrewthad):

 It looks like I forgot to provide the motivation for this. There are two
 problems with the current `ArrayArray#` interface: (1) Duplicated code and
 (2) lack of expressiveness, which pushes `unsafeCoerce#` onto end users.
 I'll go into both of these.

 Several of the primops for `ArrayArray#` have two variants. For example:

 {{{#!hs
 indexByteArrayArray# :: ArrayArray# -> Int# -> ByteArray#
 indexArrayArrayArray# :: ArrayArray# -> Int# -> ArrayArray#
 }}}

 We see the same thing for the read and write operations on
 `MutableArrayArray#` except that now we've got four variants:

 {{{#!hs
 readByteArrayArray# :: MutableArrayArray# s -> Int# -> State# s
   -> (# State# s, ByteArray# #)
 readMutableByteArrayArray# :: MutableArrayArray# s -> Int# -> State# s
   -> (# State# s, MutableByteArray# s #)
 readArrayArrayArray# :: MutableArrayArray# s -> Int# -> State# s
   -> (# State# s, ArrayArray# #)
 readMutableArrayArrayArray# :: MutableArrayArray# s -> Int# -> State# s
   -> (# State# s, MutableArrayArray# s #)
 }}}

 Under the hood, all four of these have the exact same implementation as we
 can see in
 [https://github.com/ghc/ghc/blob/8ae263ceb3566a7c82336400b09cb8f381217405/compiler/codeGen/StgCmmPrim.hs#L407-L416]:

 {{{#!hs
 emitPrimOp _      [res] ReadArrayArrayOp_ByteArray          [obj,ix]   =
   doReadPtrArrayOp res obj ix
 emitPrimOp _      [res] ReadArrayArrayOp_MutableByteArray   [obj,ix]   =
   doReadPtrArrayOp res obj ix
 emitPrimOp _      [res] ReadArrayArrayOp_ArrayArray         [obj,ix]   =
   doReadPtrArrayOp res obj ix
 emitPrimOp _      [res] ReadArrayArrayOp_MutableArrayArray  [obj,ix]   =
   doReadPtrArrayOp res obj ix
 }}}
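
 To make the duplication concrete, here is a minimal sketch in ordinary
 lifted Haskell. It is only a model: `PtrArray`, `readPtr`, `readAsInt`,
 `readAsString`, `TypedArray` and `readTyped` are hypothetical names, a list
 of `Any` stands in for the untyped pointer array, and none of this is the
 actual primop implementation.

 {{{#!hs
 import GHC.Exts (Any)
 import Unsafe.Coerce (unsafeCoerce)

 -- Hypothetical stand-in for MutableArrayArray#: a bag of untyped pointers.
 newtype PtrArray = PtrArray [Any]

 -- The single real operation, playing the role of doReadPtrArrayOp.
 readPtr :: PtrArray -> Int -> Any
 readPtr (PtrArray xs) i = xs !! i

 -- Monomorphic variants: identical bodies that differ only in the result
 -- type they claim. Nothing checks that the claim is true.
 readAsInt :: PtrArray -> Int -> Int
 readAsInt arr i = unsafeCoerce (readPtr arr i)

 readAsString :: PtrArray -> Int -> String
 readAsString arr i = unsafeCoerce (readPtr arr i)

 -- With a type parameter, one definition covers every element type and
 -- needs no coercion.
 newtype TypedArray a = TypedArray [a]

 readTyped :: TypedArray a -> Int -> a
 readTyped (TypedArray xs) i = xs !! i

 main :: IO ()
 main = do
   let untyped = PtrArray (map unsafeCoerce ["foo", "bar"])
   putStrLn (readAsString untyped 1)
   print (readTyped (TypedArray [1, 2, 3 :: Int]) 2)
   putStrLn (readTyped (TypedArray ["foo", "bar"]) 0)
 }}}

 In this model, every new element type would need yet another `readAs...`
 wrapper, while `readTyped` needs no changes.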

 I consider this duplication a minor problem. It's not very costly, and
 it's easy to see what's going on. The real problem is that, despite all
 the duplication, this interface still only captures a fraction of what
 `ArrayArray#` can really offer. The end user must explicitly use
 `unsafeCoerce#` to do a bunch of things (storing `MVar#`, `Array#`, etc.),
 and even when they want a `ByteArray#` (which the interface does safely
 allow), they must implicitly perform an unsafe coercion on every access
 (read/write/index) because the type system never really tells us what's in
 an `ArrayArray#`. In the `primitive` package, a lot of `unsafeCoerce#` is
 required to make this work:
 [http://hackage.haskell.org/package/primitive-0.6.2.0/docs/src/Data-Primitive-UnliftedArray.html#PrimUnlifted].

 Basically, the current interface doesn't take advantage of the type
 system. What we have are:

 * implicit unsafe coercion on every access
 * copying functions (`copyArrayArray#`, etc.) that don't ensure that the
 elements in both arrays are of the same type.
 * explicit `unsafeCoerce#` required whenever you want to: store `Array# a`
 inside of `ArrayArray#`, store `MutableByteArray# s` inside of
 `ArrayArray#`, store `MutVar# s a` inside of `ArrayArray#`, etc.
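
 The copying problem in the list above can be illustrated with a small
 model in ordinary lifted Haskell. This is a sketch under assumptions, not
 the real primops: `Untyped`, `Typed`, `copyUntyped` and `copyTyped` are
 hypothetical names, lists stand in for arrays, and "copy" here means
 overwriting the front of the destination with the source.

 {{{#!hs
 import GHC.Exts (Any)
 import Unsafe.Coerce (unsafeCoerce)

 -- Hypothetical model of ArrayArray#: the element type is forgotten.
 newtype Untyped = Untyped [Any]

 -- Hypothetical model of UnliftedArray# a: the element type is tracked.
 newtype Typed a = Typed [a]

 -- Overwrite the leading elements of the destination with the source.
 -- Nothing relates the element types of the two arguments.
 copyUntyped :: Untyped -> Untyped -> Untyped
 copyUntyped (Untyped src) (Untyped dst) =
   Untyped (src ++ drop (length src) dst)

 -- Same operation, but both arrays must agree on the element type.
 copyTyped :: Typed a -> Typed a -> Typed a
 copyTyped (Typed src) (Typed dst) =
   Typed (src ++ drop (length src) dst)

 sizeUntyped :: Untyped -> Int
 sizeUntyped (Untyped xs) = length xs

 toList :: Typed a -> [a]
 toList (Typed xs) = xs

 main :: IO ()
 main = do
   let ints  = Untyped (map unsafeCoerce [1, 2, 3 :: Int])
       strs  = Untyped (map unsafeCoerce ["a"])
       -- Accepted by the type checker, yet the result now mixes a String
       -- with Ints. Reading element 0 back as an Int would be undefined
       -- behaviour, so only the size is inspected here.
       mixed = copyUntyped strs ints
   print (sizeUntyped mixed)
   print (toList (copyTyped (Typed [9]) (Typed [1, 2, 3 :: Int])))
   -- The analogous mistake is rejected at compile time:
   -- copyTyped (Typed "a") (Typed [1, 2, 3 :: Int])
 }}}

 The explicit `unsafeCoerce` calls needed just to build an `Untyped` value
 mirror the third bullet: anything the interface has no dedicated variant
 for has to be coerced by hand.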

 For the most part, the advantages of having `UnliftedArray# a` are the
 same as the advantages of `Array#` carrying the type variable `a`.
 `Array#` doesn't suffer from any of the aforementioned problems. Imagine
 if the interface for `Array#` looked like this:

 {{{#!hs
 data Array#
 data MutableArray# s
 indexArray# :: Array# -> Int# -> a
 readMutableArray# :: MutableArray# s -> Int# -> State# s -> (# State# s, a #)
 }}}

 This would be terrible, and I'm glad it wasn't done that way.
 `ArrayArray#` was done that way because prior to GHC 8.0, type variables
 could only have lifted runtime representations. But now that levity
 polymorphism has been available for a while, I think it's time to look at
 cleaning up this interface.
 Plus, in the future, it might be worth having something like `SmallArray#`
 but for unlifted data. I wouldn't want to see the same bulky-but-
 inexpressive interface show up again.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14196#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

