[Haskell-cafe] Parallel weirdness
andrewcoppin at btinternet.com
Sat Apr 19 10:56:10 EDT 2008
OK, so just for fun, I decided to try implementing a parallel merge sort
using the seq and par combinators. My plan was to generate some
psuedo-random data and time how long it takes to sort it. To try to
account for lazy evaluation, what the program actually does is this:
1. Write the input data to disk without any sorting. (This ought to
force it to be fully evaluated.)
2. Sort and save the data to disk 8 times. (So I can average the runtimes.)
This is done with two data sets - one with 1 million items, and another
with 2 million rows. Each data set is run through both the purely
sequential algorithm and a simple parallel one. (Split the list in half,
merge-sort each half in parallel, and then merge them.)
The results of this little benchmark utterly defy comprehension. Allow
me to enumerate:
Weird thing #1: The first time you sort the data, it takes a few
seconds. The other 7 times, it takes a split second - roughly 100x
Weird thing #2: The parallel version runs *faster* than the sequential
one in all cases - even with SMP disabled! (We're only talking a few
percent faster, but still.)
Weird thing #3: Adding the "-threaded" compiler option makes
*everything* run a few percent faster. Even with only 1 OS thread.
Weird thing #4: Adding "-N2" makes *everything* slow down a few percent.
In particular, Task Manager shows only one CPU core in use.
Adding more than 2 OS threads makes everything slow down even further -
but that's hardly surprising.
Can anybody explain any of this behaviour? I have no idea what I'm
benchmarking, but it certainly doesn't appear to be the performance of a
parallel merge sort!
More information about the Haskell-Cafe