[commit: ghc] master: Bitmap: Fix thunk explosion (b29633f)
git at git.haskell.org
git at git.haskell.org
Thu Jul 9 00:07:31 UTC 2015
Repository : ssh://git@git.haskell.org/ghc
On branch : master
Link : http://ghc.haskell.org/trac/ghc/changeset/b29633f5cf310824f3e34716e9261162ced779d3/ghc
>---------------------------------------------------------------
commit b29633f5cf310824f3e34716e9261162ced779d3
Author: Ben Gamari <bgamari.foss at gmail.com>
Date: Thu Jul 9 02:08:01 2015 +0200
Bitmap: Fix thunk explosion
Previously we would build up another `map (-N)` thunk
for every word in the bitmap. Now we strictly accumulate the position
and carry out a single ``map (`subtract` accum)``.
`Bitmap.intsToBitmap` showed up in the profile while compiling a
testcase of #7450 (namely a program containing a record type with large
number of fields which derived `Read`). The culprit was
`CmmBuildInfoTables.procpointSRT.bitmap`. On the testcase (with 4096
fields), the profile previously looked like,
```
total time = 307.94 secs (307943 ticks @ 1000 us, 1
processor)
total alloc = 336,797,868,056 bytes (excludes profiling
overheads)
COST CENTRE MODULE %time %alloc
lintAnnots CoreLint 17.2 25.8
procpointSRT.bitmap CmmBuildInfoTables 11.3 25.2
FloatOutwards SimplCore 7.5 1.6
flatten.lookup CmmBuildInfoTables 4.0 3.9
...
```
After this fix it looks like,
```
total time = 256.88 secs (256876 ticks @ 1000 us, 1
processor)
total alloc = 255,033,667,448 bytes (excludes profiling
overheads)
COST CENTRE MODULE %time %alloc
lintAnnots CoreLint 20.3 34.1
FloatOutwards SimplCore 9.1 2.1
flatten.lookup CmmBuildInfoTables 4.8 5.2
pprNativeCode AsmCodeGen 3.7 4.3
simplLetUnfolding Simplify 3.6 2.2
StgCmm HscMain 3.6 2.1
```
Signed-off-by: Ben Gamari <ben at smart-cactus.org>
Test Plan: Validate
Reviewers: austin, simonpj
Reviewed By: simonpj
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1041
GHC Trac Issues: #7450
>---------------------------------------------------------------
b29633f5cf310824f3e34716e9261162ced779d3
compiler/cmm/Bitmap.hs | 84 ++++++++++++++++++++++++++++++++++++++------------
1 file changed, 64 insertions(+), 20 deletions(-)
diff --git a/compiler/cmm/Bitmap.hs b/compiler/cmm/Bitmap.hs
index e7aa072..22ec6ee 100644
--- a/compiler/cmm/Bitmap.hs
+++ b/compiler/cmm/Bitmap.hs
@@ -1,4 +1,4 @@
-{-# LANGUAGE CPP #-}
+{-# LANGUAGE CPP, BangPatterns #-}
--
-- (c) The University of Glasgow 2003-2006
@@ -45,31 +45,75 @@ chunkToBitmap dflags chunk =
-- eg. @[0,1,3], size 4 ==> 0xb at .
--
-- The list of @Int at s /must/ be already sorted.
-intsToBitmap :: DynFlags -> Int -> [Int] -> Bitmap
-intsToBitmap dflags size slots{- must be sorted -}
- | size <= 0 = []
- | otherwise =
- (foldr (.|.) (toStgWord dflags 0) (map (toStgWord dflags 1 `shiftL`) these)) :
- intsToBitmap dflags (size - wORD_SIZE_IN_BITS dflags)
- (map (\x -> x - wORD_SIZE_IN_BITS dflags) rest)
- where (these,rest) = span (< wORD_SIZE_IN_BITS dflags) slots
+intsToBitmap :: DynFlags
+ -> Int -- ^ size in bits
+ -> [Int] -- ^ sorted indices of ones
+ -> Bitmap
+intsToBitmap dflags size = go 0
+ where
+ word_sz = wORD_SIZE_IN_BITS dflags
+ oneAt :: Int -> StgWord
+ oneAt i = toStgWord dflags 1 `shiftL` i
+
+ -- It is important that we maintain strictness here.
+ -- See Note [Strictness when building Bitmaps].
+ go :: Int -> [Int] -> Bitmap
+ go !pos slots
+ | size <= pos = []
+ | otherwise =
+ (foldr (.|.) (toStgWord dflags 0) (map (\i->oneAt (i - pos)) these)) :
+ go (pos + word_sz) rest
+ where
+ (these,rest) = span (< (pos + word_sz)) slots
-- | Make a bitmap where the slots specified are the /zeros/ in the bitmap.
-- eg. @[0,1,3], size 4 ==> 0x4@ (we leave any bits outside the size as zero,
-- just to make the bitmap easier to read).
--
-- The list of @Int at s /must/ be already sorted and duplicate-free.
-intsToReverseBitmap :: DynFlags -> Int -> [Int] -> Bitmap
-intsToReverseBitmap dflags size slots{- must be sorted -}
- | size <= 0 = []
- | otherwise =
- (foldr xor (toStgWord dflags init) (map (toStgWord dflags 1 `shiftL`) these)) :
- intsToReverseBitmap dflags (size - wORD_SIZE_IN_BITS dflags)
- (map (\x -> x - wORD_SIZE_IN_BITS dflags) rest)
- where (these,rest) = span (< wORD_SIZE_IN_BITS dflags) slots
- init
- | size >= wORD_SIZE_IN_BITS dflags = -1
- | otherwise = (1 `shiftL` size) - 1
+intsToReverseBitmap :: DynFlags
+ -> Int -- ^ size in bits
+ -> [Int] -- ^ sorted indices of zeros free of duplicates
+ -> Bitmap
+intsToReverseBitmap dflags size = go 0
+ where
+ word_sz = wORD_SIZE_IN_BITS dflags
+ oneAt :: Int -> StgWord
+ oneAt i = toStgWord dflags 1 `shiftL` i
+
+ -- It is important that we maintain strictness here.
+ -- See Note [Strictness when building Bitmaps].
+ go :: Int -> [Int] -> Bitmap
+ go !pos slots
+ | size <= pos = []
+ | otherwise =
+ (foldr xor (toStgWord dflags init) (map (\i->oneAt (i - pos)) these)) :
+ go (pos + word_sz) rest
+ where
+ (these,rest) = span (< (pos + word_sz)) slots
+ remain = size - pos
+ init
+ | remain >= word_sz = -1
+ | otherwise = (1 `shiftL` remain) - 1
+
+{-
+
+Note [Strictness when building Bitmaps]
+========================================
+
+One of the places where @Bitmap@ is used is in in building Static Reference
+Tables (SRTs) (in @CmmBuildInfoTables.procpointSRT@). In #7450 it was noticed
+that some test cases (particularly those whose C-- have large numbers of CAFs)
+produced large quantities of allocations from this function.
+
+The source traced back to 'intsToBitmap', which was lazily subtracting the word
+size from the elements of the tail of the @slots@ list and recursively invoking
+itself with the result. This resulted in large numbers of subtraction thunks
+being built up. Here we take care to avoid passing new thunks to the recursive
+call. Instead we pass the unmodified tail along with an explicit position
+accumulator, which get subtracted in the fold when we compute the Word.
+
+-}
{- |
Magic number, must agree with @BITMAP_BITS_SHIFT@ in InfoTables.h.
More information about the ghc-commits
mailing list