[Haskell-beginners] parMap on multicore computers

Fri May 22 18:02:14 EDT 2009

Hi Miguel,

I don't think you can expect a program that processes each line of a 
text file in parallel to be very efficient.  Opening a new thread is 
usually fairly cheap; however, there is some bookkeeping involved that 
shouldn't be underestimated.  You only have 2 or 4 cores, so opening 100 
threads or more, depending on the size of your file, will do you no 
good.  You should rather split your file into 4 chunks and then 
process these *4* chunks in parallel.

That should make it more efficient!  Parallel /= faster.  At least not 
automatically.
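
A rough sketch of the chunking idea, in case it helps.  The names
chunksInto and distancesChunked are made up for illustration, and the
edist below is only a stand-in Levenshtein distance, since the real
edist was not included in the original mail.  It assumes a current
release of the parallel package, where the strategy is called rdeepseq
(the code quoted below uses the older name rnf).

```haskell
import Control.Parallel.Strategies (parMap, rdeepseq)

-- Split a list into at most n contiguous chunks of roughly equal size.
-- (Hypothetical helper, not part of any library.)
chunksInto :: Int -> [a] -> [[a]]
chunksInto n xs = go xs
  where
    size = max 1 ((length xs + n - 1) `div` max 1 n)
    go [] = []
    go ys = let (h, t) = splitAt size ys in h : go t

-- Stand-in edit distance (plain Levenshtein, one table row at a time).
-- This is an assumption: substitute the real edist here.
edist :: String -> String -> Int
edist s t = last (foldl step [0 .. length t] s)
  where
    -- prev is never empty because the initial row [0 .. length t] is not.
    step prev@(p:ps) c = scanl compute (p + 1) (zip3 t prev ps)
      where
        compute left (tc, diag, up) =
          minimum [up + 1, left + 1, diag + fromEnum (tc /= c)]

-- Evaluate one chunk per parallel task instead of one line per task,
-- then flatten the per-chunk results back into a single list.
distancesChunked :: Int -> String -> [String] -> [Int]
distancesChunked n target ls =
  concat (parMap rdeepseq (map (edist target)) (chunksInto n ls))
```

From main you would call something like
distancesChunked 4 longString (lines contents), compile with -threaded
as before, and run with +RTS -N4.  Whether 4 is the right number of
chunks for your machine is something to measure rather than assume.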

Happy Hacking,
Thomas

Miguel Pignatelli wrote:
> Hi all,
>
> I'm experimenting a bit with the parallelization capabilities of Haskell.
> What I am trying to do is to process in parallel all the lines of a 
> text file, calculating the edit distance of each of these lines with a 
> given string.
> This is my testing code:
>
>
> import System.IO
> import Control.Monad
> import Control.Parallel
> import Control.Parallel.Strategies
>
> edist :: String -> String -> Int
> -- edist calculates the edit distance of 2 strings
> -- see for example
> -- http://www.csse.monash.edu.au/~lloyd/tildeFP/Haskell/1998/Edit01/
>
> getLines :: FilePath -> IO [Int]
> getLines = liftM ((parMap rnf (edist longString)) . lines) . readFile
>
> main :: IO ()
> main = do
>   list <- getLines "input.txt"
>   mapM_ ( putStrLn . show ) list
>
> I am testing this code in a 2xQuadCore linux (Ubuntu 8.10) machine (8 
> cores in total).
> The code has been compiled with
>
> ghc --make -threaded mytest.hs
>
> I've been trying input files of different lengths, but the more cores 
> I try to use, the worse the performance I am getting.
> Here are some examples:
>
> # input.txt -> 10 lines (strings) of ~1200 letters each
> $ time ./mytest +RTS -N1 > /dev/null 
>
> real 0m4.775s
> user 0m4.700s
> sys 0m0.080s
>
> $ time ./mytest +RTS -N4 > /dev/null
>
> real 0m6.272s
> user 0m8.220s
> sys 0m0.290s
>
> $ time ./mytest +RTS -N8 > /dev/null
>
> real 0m7.090s
> user 0m10.960s
> sys 0m0.400s
>
> # input.txt -> 100 lines (strings) of ~1200 letters each
> $ time ./mytest +RTS -N1 > /dev/null
>
> real 0m49.854s
> user 0m49.730s
> sys 0m0.120s
>
> $ time ./mytest +RTS -N4 > /dev/null
>
> real 1m11.303s
> user 1m36.210s
> sys 0m1.070s
>
> $ time ./mytest +RTS -N8 > /dev/null
>
> real 1m19.488s
> user 2m6.250s
> sys 0m1.270s
>
>
> What is going wrong in this code? Is this a problem of the "grain 
> size" of the parallelization?
> Any help / advice would be very welcome,
>
> M;
>