[Haskell-cafe] SMP parallelism increasing GC time dramatically
Alexander Kjeldaas
alexander.kjeldaas at gmail.com
Sat Oct 8 02:13:18 CEST 2011
I am guessing that it is slowdown caused by GC needing to co-ordinate with
blocked threads. That requires lots of re-scheduling to happen in the
kernel.
This is a hard problem I think, but also increasingly important as
virtualization becomes more important and the number of schedulable cores
unknown.
Alexander
On 7 October 2011 12:31, Oliver Batchelor <saulzar at gmail.com> wrote:
> I'm not sure if this is at all related, but if I run a small Repa program
> with more threads than I have cores/CPUs then it gets drastically slower, I
> have a dual core laptop - and -N2 makes my small program take approximately
> 0.6 of the time. Increasing to -N4 and we're running about 2x the time, -N8
> and it's taking 20x or more. I guess this is probably more down to the
> design of Repa rather than GHC itself?
>
> Oliver
>
> On Sat, Oct 8, 2011 at 1:21 AM, Tom Thorne <thomas.thorne21 at gmail.com>wrote:
>
>> I have made a dummy program that seems to exhibit the same GC
>> slowdown behavior, minus the segmentation faults. Compiling with -threaded
>> and running with -N12 I get very bad performance (3x slower than -N1),
>> running with -N12 -qg it runs approximately 3 times faster than -N1. I don't
>> know if I should submit this as a bug or not? I'd certainly like to know why
>> this is happening!
>>
>> import Numeric.LinearAlgebra
>> import Numeric.GSL.Special.Gamma
>> import Control.Parallel.Strategies
>> import Control.Monad
>> import Data.IORef
>> import Data.Random
>> import Data.Random.Source.PureMT
>> import Debug.Trace
>> --
>>
>> subsets s n = (subsets_stream s) !! n
>>
>> subsets_stream [] = [[]] : repeat []
>> subsets_stream (x:xs) =
>> let r = subsets_stream xs
>> s = map (map (x:)) r
>> in [[]] : zipWith (++) s (tail r)
>>
>> testfun :: Matrix Double -> Int -> [Int] -> Double
>> testfun x k cs = lngamma (det (c+u))
>> where
>> (m,c) = meanCov xx
>> m' = fromRows [m]
>> u = (trans m') <> m'
>> xx = fromColumns ( [(toColumns x)!!i] ++ [(toColumns x)!!j] ++
>> [(toColumns x)!!k] )
>> i = cs !! 0
>> j = cs !! 1
>>
>>
>> test :: Matrix Double -> Int -> Double
>> test x i = sum p
>> where
>> p = parMap (rdeepseq) (testfun x (i+1)) (subsets [0..i] 2)
>>
>>
>>
>> ranMatrix :: Int -> RVar (Matrix Double)
>> ranMatrix n = do
>> xs <- mapM (\_ -> mapM (\_ -> uniform 0 1.0) [1..n]) [1..n]
>> return (fromLists xs)
>>
>>
>> loop :: Int -> Double -> Int -> RVar Double
>> loop n s i = traceShow i $ do
>> x <- ranMatrix n
>> let r = sum $ parMap (rdeepseq) (test x) [2..(n-2)]
>> return (r+s)
>>
>> main = do
>> let n = 100
>> let iter = 5
>> rng <- newPureMT
>> rngr <- newIORef rng
>> p <- runRVar (foldM (loop n) 0.0 [1..iter]) rngr
>> print p
>>
>> I have also found that the segmentation faults in my code disappear if I
>> switch from Control.Parallel to Control.Monad.Par, which is quite strange. I
>> get slightly better performance with Control.Parallel when it completes
>> without a seg. fault, and the frequency with which it does so seems to
>> depend on the number of sparks that are being created.
>>
>> On Thu, Oct 6, 2011 at 1:56 PM, Tom Thorne <thomas.thorne21 at gmail.com>wrote:
>>
>>> I'm trying to narrow it down so that I can submit a meaningful bug
>>> report, and it seems to be something to do with switching off parallel GC
>>> using -qg, whilst also passing -Nx.
>>>
>>> Are there any known issues with this that people are aware of? At the
>>> moment I am using the latest haskell platform release on arch.
>>>
>>> I'd like to give 7.2 a try in case that fixes it, but I'm using rather a
>>> lot of libraries (hmatrix, fclabels, random fu) and I don't know how to
>>> install them for multiple ghc versions
>>>
>>> On Wed, Oct 5, 2011 at 10:43 PM, Johan Tibell <johan.tibell at gmail.com>wrote:
>>>
>>>> On Wed, Oct 5, 2011 at 2:37 PM, Tom Thorne <thomas.thorne21 at gmail.com>wrote:
>>>>
>>>>> The only problem is that now I am getting random occasional
>>>>> segmentation faults that I was not been getting before, and once got a
>>>>> message saying:
>>>>> Main: schedule: re-entered unsafely
>>>>> Perhaps a 'foreign import unsafe' should be 'safe'?
>>>>> I think this may be something to do with creating a lot of sparks
>>>>> though, since this occurs whether I have the parallel GC on or not.
>>>>>
>>>>
>>>> Unless you (or some library you're using) is doing what the error
>>>> message says then you should file a GHC bug here:
>>>>
>>>> http://hackage.haskell.org/trac/ghc/
>>>>
>>>> -- Johan
>>>>
>>>>
>>>
>>>
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20111007/6cc6f4a1/attachment.htm>
More information about the Haskell-Cafe
mailing list