[GHC] #9246: GHC generates poor code for repeated uses of min/max

GHC ghc-devs at haskell.org
Sun Jun 29 15:57:46 UTC 2014


#9246: GHC generates poor code for repeated uses of min/max
--------------------------------------------+------------------------------
        Reporter:  arotenberg               |            Owner:
            Type:  feature request          |           Status:  new
        Priority:  normal                   |        Milestone:  7.10.1
       Component:  Compiler                 |          Version:  7.8.2
      Resolution:                           |         Keywords:
Operating System:  Windows                  |     Architecture:  x86_64
 Type of failure:  Runtime performance bug  |  (amd64)
       Test Case:                           |       Difficulty:  Unknown
        Blocking:                           |       Blocked By:
                                            |  Related Tickets:  #6135
--------------------------------------------+------------------------------

Comment (by arotenberg):

 I tried using criterion to profile different versions of this function,
 but I couldn't get the results to not be noisy and "microbenchmarky".

 The way I originally found this issue was by profiling the raytracer
 program I'm writing, which fingered this function as both the single
 largest time sink and the largest allocator. I tried replacing the
 definition with this variant:
 {{{
 testRayAABBIntersection :: Ray -> AABB -> Bool
 testRayAABBIntersection
       (Ray (Vec3 (D# ox) (D# oy) (D# oz)) _
         (Vec3 (D# invDx) (D# invDy) (D# invDz)))
       (AABB (Vec3 (D# minX) (D# minY) (D# minZ))
         (Vec3 (D# maxX) (D# maxY) (D# maxZ))) =
     let tx1 = minusTimes## minX ox invDx
         tx2 = minusTimes## maxX ox invDx
         ty1 = minusTimes## minY oy invDy
         ty2 = minusTimes## maxY oy invDy
         tz1 = minusTimes## minZ oz invDz
         tz2 = minusTimes## maxZ oz invDz
         tmin = min## tx1 tx2 `max##` min## ty1 ty2 `max##` min## tz1 tz2
         tmax = max## tx1 tx2 `min##` max## ty1 ty2 `min##` max## tz1 tz2
     in isTrue# (tmax >=## max## 0.0## tmin)

 {-# NOINLINE minusTimes## #-}
 minusTimes## :: Double# -> Double# -> Double# -> Double#
 minusTimes## minA oa invDa = (minA -## oa) *## invDa
 {-# NOINLINE min## #-}
 {-# NOINLINE max## #-}
 min##, max## :: Double# -> Double# -> Double#
 max## x y = if isTrue# (x <=## y) then y else x
 min## x y = if isTrue# (x <=## y) then x else y
 }}}
 This gets the function's heap allocation down to the expected zero bytes,
 but at the cost of actually making the program slower (both with and
 without profiling)!

 I haven't tried comparing results with the LLVM backend yet. I might look
 into it later today.

 It would be nice to see non-branching min/max implemented for the
 primitive numeric types where available. Min/max for integer types can
 probably be implemented efficiently on many platforms using conditional
 move instructions. I'm not particularly interested in implementing it
 myself, but I'm happy with having it as a feature request.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9246#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list