[GHC] #14196: Replace ArrayArray# with either UnliftedArray# or Array#
GHC
ghc-devs at haskell.org
Thu Sep 7 23:58:32 UTC 2017
-------------------------------------+-------------------------------------
Reporter: andrewthad | Owner: (none)
Type: feature request | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.2.1
Resolution: | Keywords: LevityPolymorphism
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by andrewthad):
It looks like I forgot to provide the motivation for this. There are two
problems with the current `ArrayArray#` interface: (1) Duplicated code and
(2) lack of expressiveness, which pushes `unsafeCoerce#` onto end users.
I'll go into both of these.
Several of the primops for `ArrayArray#` have two variants. For example:
{{{#!hs
indexByteArrayArray# :: ArrayArray# -> Int# -> ByteArray#
indexArrayArrayArray# :: ArrayArray# -> Int# -> ArrayArray#
}}}
We see the same thing for the read and write operations on
`MutableArrayArray#` except that now we've got four variants:
{{{#!hs
readByteArrayArray# :: MutableArrayArray# s -> Int# -> State# s ->
  (# State# s, ByteArray# #)
readMutableByteArrayArray# :: MutableArrayArray# s -> Int# -> State# s ->
  (# State# s, MutableByteArray# s #)
readArrayArrayArray# :: MutableArrayArray# s -> Int# -> State# s ->
  (# State# s, ArrayArray# #)
readMutableArrayArrayArray# :: MutableArrayArray# s -> Int# -> State# s ->
  (# State# s, MutableArrayArray# s #)
}}}
Under the hood, all four of these have the exact same implementation as we
can see in
[https://github.com/ghc/ghc/blob/8ae263ceb3566a7c82336400b09cb8f381217405/compiler/codeGen/StgCmmPrim.hs#L407-L416]:
{{{#!hs
emitPrimOp _ [res] ReadArrayArrayOp_ByteArray [obj,ix] =
    doReadPtrArrayOp res obj ix
emitPrimOp _ [res] ReadArrayArrayOp_MutableByteArray [obj,ix] =
    doReadPtrArrayOp res obj ix
emitPrimOp _ [res] ReadArrayArrayOp_ArrayArray [obj,ix] =
    doReadPtrArrayOp res obj ix
emitPrimOp _ [res] ReadArrayArrayOp_MutableArrayArray [obj,ix] =
    doReadPtrArrayOp res obj ix
}}}
I consider this duplication a minor problem. It's not very costly, and
it's easy to see what's going on. The real problem is that, despite all
the duplication, this interface still captures only a fraction of what
`ArrayArray#` can really offer. The end user must explicitly use
`unsafeCoerce#` to do a bunch of things (storing `MVar#`, `Array#`, etc.),
and even when they want a `ByteArray#` (which the interface does safely
allow), they must implicitly perform an unsafe coercion on every access
(read/write/index) because the type system never really tells us what's in
an `ArrayArray#`. In the `primitive` package, a lot of `unsafeCoerce#` is
required to make it work:
[http://hackage.haskell.org/package/primitive-0.6.2.0/docs/src/Data-Primitive-UnliftedArray.html#PrimUnlifted].
Basically, the current interface doesn't take advantage of the
type system. What we have are:
* implicit unsafe coercion on every access
* copying functions (`copyArrayArray#`, etc.) that don't ensure that the
elements in both arrays are of the same type.
* explicit `unsafeCoerce#` required whenever you want to: store `Array# a`
inside of `ArrayArray#`, store `MutableByteArray# s` inside of
`ArrayArray#`, store `MutVar# s a` inside of `ArrayArray#`, etc.
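The three points above can be modelled in ordinary lifted Haskell. This is only an analogy, not the real primops: the made-up `Untyped` type stands in for `ArrayArray#`, and the made-up `Typed a` stands in for the proposed `UnliftedArray# a`. All names in the sketch are hypothetical.

{{{#!hs
import Data.Array (Array, listArray, (!))
import GHC.Exts (Any)
import Unsafe.Coerce (unsafeCoerce)

-- Hypothetical model of ArrayArray#: the element type is not recorded.
newtype Untyped = Untyped (Array Int Any)

-- Storing erases the element type.
fromListUntyped :: [a] -> Untyped
fromListUntyped xs =
  Untyped (listArray (0, length xs - 1) (map unsafeCoerce xs))

-- Every read invents a result type; the caller must simply know what
-- was stored. Asking for the wrong type is undefined behaviour.
indexUntyped :: Untyped -> Int -> a
indexUntyped (Untyped arr) i = unsafeCoerce (arr ! i)

-- Hypothetical model of UnliftedArray# a: a phantom parameter records
-- the element type, so the coercions stay hidden behind a safe API.
newtype Typed a = Typed Untyped

fromListTyped :: [a] -> Typed a
fromListTyped = Typed . fromListUntyped

indexTyped :: Typed a -> Int -> a
indexTyped (Typed u) = indexUntyped u

main :: IO ()
main = do
  let u = fromListUntyped ["a", "b"]
      t = fromListTyped ["a", "b"]
  putStrLn (indexUntyped u 1) -- type checks at any result type we claim
  putStrLn (indexTyped t 1)   -- result type is forced to be String
}}}

With `Typed a`, asking for an `Int` out of an array of strings is a compile-time error; with `Untyped`, it compiles and misbehaves at run time.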
For the most part, the advantages of having `UnliftedArray# a` are similar
to the advantages of `Array#` having the type variable `a`.
`Array#` doesn't suffer from any of the aforementioned problems. Imagine
if the interface for `Array#` looked like this:
{{{#!hs
data Array#
data MutableArray# s
indexArray# :: Array# -> Int# -> a
writeArray# :: MutableArray# s -> Int# -> a -> State# s -> State# s
}}}
This would be terrible, and I'm glad it wasn't done that way.
`ArrayArray#` was done that way because prior to GHC 8.0, type variables
could only have lifted runtime representations. But now that we've had
levity polymorphism for a while, I think it's time to look at cleaning up
this interface.
Plus, in the future, it might be worth having something like `SmallArray#`
but for unlifted data. I wouldn't want to see the same
bulky-but-inexpressive interface show up again.
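For contrast with the untyped `Array#` signatures imagined earlier, here is a minimal sketch of the real, typed `Array#` primops in use. The helper `pairElem` is a hypothetical name introduced only for this example, and it does no bounds checking. No `unsafeCoerce#` appears anywhere, because the type variable `a` carries the element type through every operation.

{{{#!hs
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
  (Int (I#), indexArray#, newArray#, unsafeFreezeArray#, writeArray#)
import GHC.ST (ST (ST), runST)

-- Build a two-element array holding x and y, then return element i.
-- The index is assumed to be 0 or 1; it is not checked.
pairElem :: a -> a -> Int -> a
pairElem x y (I# i) = runST (ST (\s0 ->
  case newArray# 2# x s0 of                -- both slots start as x
    (# s1, marr #) ->
      case writeArray# marr 1# y s1 of     -- slot 1 becomes y
        s2 ->
          case unsafeFreezeArray# marr s2 of
            (# s3, arr #) ->
              case indexArray# arr i of    -- typed read, no coercion
                (# v #) -> (# s3, v #)))

main :: IO ()
main = do
  putStrLn (pairElem "zero" "one" 1)  -- prints "one"
  print (pairElem (10 :: Int) 20 0)   -- prints 10
}}}

The same code works for any element type, and mixing element types within one array is rejected by the type checker. That is the property the `ArrayArray#` interface is missing.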
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14196#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler