[GHC] #10652: Better cache performance in Array#

Mon Jul 20 00:14:56 UTC 2015

#10652: Better cache performance in Array#
-------------------------------------+-------------------------------------
        Reporter:  MikeIzbicki       |                   Owner:
            Type:  feature request   |                  Status:  new
        Priority:  normal            |               Milestone:
       Component:  Compiler          |                 Version:  7.10.1
      Resolution:                    |                Keywords:
Operating System:  Unknown/Multiple  |            Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |               Test Case:
      Blocked By:                    |                Blocking:
 Related Tickets:                    |  Differential Revisions:
-------------------------------------+-------------------------------------

Comment (by rwbarton):

 If the values pointed to by your array don't themselves contain pointers
 (and assuming nothing else on the heap points to them) then either a BFS
 or a DFS traversal will visit them all sequentially. But in many cases if
 your values don't contain pointers then you could find a way to store them
 in an unboxed array with higher memory density anyways. (Using a boxed
 array costs you two words per element up front, one for the pointer in the
 array and one for the info pointer of the

 If your values do themselves contain pointers then it does matter whether
 the traversal is BFS or DFS, but which one you want depends on how you
 plan to use the values, which the GC can't know without further hints. For
 example a large Integer contains a pointer to a ByteArray# containing the
 actual integer data, which you will need for most (but not all) operations
 on the Integer. If you have an array with a lot of large Integers, you may
 or may not want to put the ByteArray# after its corresponding Integer
 (even ignoring the possibility of the ByteArray#s being shared).

 My potential concerns about the original feature request are

 - It sounds like a lot of extra complexity in the GC
 - Existing programs gain nothing from this added complexity unless they
 are modified to use the new primops
 - Consequently the implementation of the new primops must have zero cost
 for programs that do not use them
 - A fairly narrow range of programs stand to benefit from the new primops
 - It's unclear how much there is to be gained in the best case, especially
 compared to user-space alternatives like explicitly prefetching the next
 value of the array before processing the current one

 You might be better off just implementing your own memory management (e.g.
 allocate variable-length representations of your data sequentially from a
 mutable ByteArray, then store offsets into this storage in your array and
 access the data through short-lived Haskell values, Storable-style).

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/10652#comment:7>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler