[GHC] #9203: Perf regression in 7.8.2 relative to 7.6.3, possibly related to Typeable

GHC ghc-devs at haskell.org
Fri Jun 13 14:40:34 UTC 2014


#9203: Perf regression in 7.8.2 relative to 7.6.3, possibly related to Typeable
--------------------------------------------+------------------------------
        Reporter:  simonmar                 |            Owner:
            Type:  bug                      |           Status:  new
        Priority:  normal                   |        Milestone:  7.8.3
       Component:  Compiler                 |          Version:  7.8.2
      Resolution:                           |         Keywords:
Operating System:  Unknown/Multiple         |     Architecture:
 Type of failure:  Runtime performance bug  |  Unknown/Multiple
       Test Case:                           |       Difficulty:  Unknown
        Blocking:                           |       Blocked By:
                                            |  Related Tickets:
--------------------------------------------+------------------------------
Changes (by simonmar):

 * cc: tibbe (added)


Comment:

 I've narrowed this down further.  It seems to be something to do with
 `HashMap`.

 With the following source file:

 {{{
 {-# LANGUAGE RankNTypes, GADTs, BangPatterns, DeriveDataTypeable,
     StandaloneDeriving #-}
 {-# OPTIONS_GHC -fno-warn-unused-do-bind -fno-warn-type-defaults #-}

 module Bench where

 import Prelude hiding (mapM)

 import Control.Concurrent
 import Data.Hashable
 import Data.Time.Clock
 import Data.Traversable
 import Data.Typeable
 import System.Environment
 import Text.Printf
 import qualified Data.HashMap.Strict as HashMap
 import Data.HashMap.Strict (HashMap)
 import Unsafe.Coerce

 data TestReq a where
   ReqInt    :: {-# UNPACK #-} !Int -> TestReq Int
   deriving Typeable

 deriving instance Eq (TestReq a)
 deriving instance Show (TestReq a)

 instance Hashable (TestReq a) where
   hashWithSalt salt (ReqInt i) = hashWithSalt salt (0::Int, i)

 main = do
   [n] <- fmap (fmap read) getArgs
   t0 <- getCurrentTime
   let
      f 0  !cache = cache
      f !n !cache = f (n-1) (dcinsert (ReqInt n) 0 cache)
   --
   let !cache = f n dcempty
   let m = dclookup (ReqInt (n `div` 2)) cache
   print m
   t1 <- getCurrentTime
   printf "insert: %.2fs\n" (realToFrac (t1 `diffUTCTime` t0) :: Double)

   t0 <- getCurrentTime
   let
      f 0  !m = m
      f !n !m = case dclookup (ReqInt n) cache of
                  Nothing -> f (n-1) m
                  Just _  -> f (n-1) (m+1)
   print (f n 0)
   t1 <- getCurrentTime
   printf "lookup: %.2fs\n" (realToFrac (t1 `diffUTCTime` t0) :: Double)


 newtype DataCache = DataCache (HashMap TypeRep SubCache)

 -- | The implementation is a two-level map: the outer level maps the
 -- types of requests to 'SubCache', which maps actual requests to their
 -- results.  So each 'SubCache' contains requests of the same type.
 -- This works well because we only have to store the dictionaries for
 -- 'Hashable' and 'Eq' once per request type.
 data SubCache =
   forall req a . (Hashable (req a), Eq (req a)) =>
        SubCache ! (HashMap (req a) a)
        -- NB. the inner HashMap is strict, to avoid building up
        -- a chain of thunks during repeated insertions.

 -- | A new, empty 'DataCache'.
 dcempty :: DataCache
 dcempty = DataCache HashMap.empty

 -- | Inserts a request-result pair into the 'DataCache'.
 dcinsert
   :: (Hashable (r a), Typeable (r a), Eq (r a))
   => r a
   -- ^ Request
   -> a
   -- ^ Result
   -> DataCache
   -> DataCache

 dcinsert req result (DataCache m) =
       DataCache $
         HashMap.insertWith fn (typeOf req)
                               (SubCache (HashMap.singleton req result)) m
   where
     fn (SubCache new) (SubCache old) =
         SubCache (unsafeCoerce new `HashMap.union` old)

 -- | Looks up the cached result of a request.
 dclookup
   :: Typeable (r a)
   => r a
   -- ^ Request
   -> DataCache
   -> Maybe a

 dclookup req (DataCache m) =
       case HashMap.lookup (typeOf req) m of
         Nothing -> Nothing
         Just (SubCache sc) ->
            unsafeCoerce (HashMap.lookup (unsafeCoerce req) sc)
 }}}

 GHC 7.6.3:

 {{{
 Just 0
 insert: 0.73s
 500000
 lookup: 0.23s
 }}}

 GHC 7.8.2:

 {{{
 Just 0
 insert: 1.01s
 500000
 lookup: 0.53s
 }}}

 `insert` is a bit slower, but `lookup` is more than twice as slow with
 7.8.2.

 Looking at the Core, at lookup in particular, the code in 7.8.2 looks
 reasonable.  But in both cases we end up calling
 `Data.HashMap.Base.lookup` for the inner lookup, and I'm guessing that is
 where the inefficiency lies.

 @tibbe, want to take a look?

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9203#comment:1>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list