[Haskell-cafe] Parallel weirdness

Sat Apr 19 11:10:37 EDT 2008

Hello Andrew,

Saturday, April 19, 2008, 6:56:10 PM, you wrote:

> OK, so just for fun, I decided to try implementing a parallel merge sort

coincedence - now i'm writing a parallel compression algorithm, very
much like parallel bzip2, but using ghc, of course

> Weird thing #1: The first time you sort the data, it takes a few
> seconds. The other 7 times, it takes a split second - roughly 100x 
> faster. Wuh?

this looks like disk caching effects. if data are read from disj on
first run and from disk cache on the next runs, this only means that
your algorithm works faster than reading its data from disk

> Weird thing #2: The parallel version runs *faster* than the sequential
> one in all cases - even with SMP disabled! (We're only talking a few 
> percent faster, but still.)

> Weird thing #3: Adding the "-threaded" compiler option makes
> *everything* run a few percent faster. Even with only 1 OS thread.

there are plenty of reasons: first, -threaded make i/o overlapped
with calculations. second, parallel version may exhibit better cpu
cache behavior - such as processing all data in cache before sending
it back to memory

> Weird thing #4: Adding "-N2" makes *everything* slow down a few percent.
> In particular, Task Manager shows only one CPU core in use.

it's easy - your algorithm isn't really parallel, and you just forced
ghc to move it from one core to another. it's overhead of moving data
around :)

> Can anybody explain any of this behaviour? I have no idea what I'm 
> benchmarking, but it certainly doesn't appear to be the performance of a
> parallel merge sort!

there are many subtle effects making optimization much more interesting
than using simple schemas ;)  it's why i like it so much :))

-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin at gmail.com