[Haskell-cafe] SMP parallelism increasing GC time dramatically

Tom Thorne thomas.thorne21 at gmail.com
Mon Oct 10 15:55:54 CEST 2011


Yes I will try to run threadscope on it, I tried it before and the event log
output produced about 1.8GB, and then crashed.

Is there any way to tell the RTS to perform GC less often? My code doesn't
use too much memory and I'm using fairly hefty machines (e.g one with 48
cores and 128GB of RAM) and so perhaps the default/heuristic settings aren't
optimal.
On Sun, Oct 9, 2011 at 4:16 PM, Thomas Schilling <nominolo at googlemail.com>wrote:

> It would be really useful to see the threadscope output for this.
> Apart from cache effects (which may well be significant at 12 cores),
> the usual problems with parallel GHC are synchronisation.
>
> When GHC wants to perform a parallel GC it needs to stop all Haskell
> threads.  These are lightweight threads and are scheduled
> cooperatively, i.e., there's no way to interrupt them from the outside
> (except for the OS, but that doesn't help with GC).  Usually, a thread
> is able to yield whenever it tries to do an allocation which is common
> enough in normal Haskell.  However, your work contains lots of matrix
> computation which likely don't do allocations in the inner loop or
> call C to do their work, which isn't interruptible, either.  My guess
> would be that (at least part of) the reason for your slowdown is that
> the parallel GC spends a lot of time waiting for threads to stop.
> This would be apparent in Threadscope.  (I may be wrong, because even
> the single-threaded GC needs to stop all threads)
>
> On 7 October 2011 18:21, Tom Thorne <thomas.thorne21 at gmail.com> wrote:
> > I have made a dummy program that seems to exhibit the same GC
> > slowdown behavior, minus the segmentation faults. Compiling with
> -threaded
> > and running with -N12 I get very bad performance (3x slower than -N1),
> > running with -N12 -qg it runs approximately 3 times faster than -N1. I
> don't
> > know if I should submit this as a bug or not? I'd certainly like to know
> why
> > this is happening!
> > import Numeric.LinearAlgebra
> > import Numeric.GSL.Special.Gamma
> > import Control.Parallel.Strategies
> > import Control.Monad
> > import Data.IORef
> > import Data.Random
> > import Data.Random.Source.PureMT
> > import Debug.Trace
> > --
> > subsets s n = (subsets_stream s) !! n
> > subsets_stream [] = [[]] : repeat []
> > subsets_stream (x:xs) =
> > let r = subsets_stream xs
> >    s = map (map (x:)) r
> > in [[]] : zipWith (++) s (tail r)
> > testfun :: Matrix Double -> Int -> [Int] -> Double
> > testfun x k cs = lngamma (det (c+u))
> > where
> > (m,c) = meanCov xx
> > m' = fromRows [m]
> > u = (trans m') <> m'
> > xx = fromColumns ( [(toColumns x)!!i] ++ [(toColumns x)!!j] ++
> [(toColumns
> > x)!!k] )
> > i = cs !! 0
> > j = cs !! 1
> >
> > test :: Matrix Double -> Int -> Double
> > test x i = sum p
> > where
> > p = parMap (rdeepseq) (testfun x (i+1)) (subsets [0..i] 2)
> >
> >
> > ranMatrix :: Int -> RVar (Matrix Double)
> > ranMatrix n = do
> > xs <- mapM (\_ -> mapM (\_ -> uniform 0 1.0) [1..n]) [1..n]
> > return (fromLists xs)
> >
> > loop :: Int -> Double -> Int -> RVar Double
> > loop n s i = traceShow i $ do
> > x <- ranMatrix n
> > let r = sum $ parMap (rdeepseq) (test x) [2..(n-2)]
> > return (r+s)
> > main = do
> > let n = 100
> > let iter = 5
> > rng <- newPureMT
> > rngr <- newIORef rng
> > p <- runRVar (foldM (loop n) 0.0 [1..iter]) rngr
> > print p
> > I have also found that the segmentation faults in my code disappear if I
> > switch from Control.Parallel to Control.Monad.Par, which is quite
> strange. I
> > get slightly better performance with Control.Parallel when it completes
> > without a seg. fault, and the frequency with which it does so seems to
> > depend on the number of sparks that are being created.
> > On Thu, Oct 6, 2011 at 1:56 PM, Tom Thorne <thomas.thorne21 at gmail.com>
> > wrote:
> >>
> >> I'm trying to narrow it down so that I can submit a meaningful bug
> report,
> >> and it seems to be something to do with switching off parallel GC using
> -qg,
> >> whilst also passing -Nx.
> >> Are there any known issues with this that people are aware of? At the
> >> moment I am using the latest haskell platform release on arch.
> >> I'd like to give 7.2 a try in case that fixes it, but I'm using rather a
> >> lot of libraries (hmatrix, fclabels, random fu) and I don't know how to
> >> install them for multiple ghc versions
> >> On Wed, Oct 5, 2011 at 10:43 PM, Johan Tibell <johan.tibell at gmail.com>
> >> wrote:
> >>>
> >>> On Wed, Oct 5, 2011 at 2:37 PM, Tom Thorne <thomas.thorne21 at gmail.com>
> >>> wrote:
> >>>>
> >>>> The only problem is that now I am getting random occasional
> segmentation
> >>>> faults that I was not been getting before, and once got a message
> saying:
> >>>> Main: schedule: re-entered unsafely
> >>>> Perhaps a 'foreign import unsafe' should be 'safe'?
> >>>> I think this may be something to do with creating a lot of sparks
> >>>> though, since this occurs whether I have the parallel GC on or not.
> >>>
> >>> Unless you (or some library you're using) is doing what the error
> message
> >>> says then you should file a GHC bug here:
> >>>
> >>> http://hackage.haskell.org/trac/ghc/
> >>>
> >>> -- Johan
> >>>
> >>
> >
> >
> > _______________________________________________
> > Haskell-Cafe mailing list
> > Haskell-Cafe at haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> >
>
>
>
> --
> Push the envelope. Watch it bend.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20111010/426166af/attachment.htm>


More information about the Haskell-Cafe mailing list