[Haskell-cafe] announce: Glome.hs-0.3 (Haskell raytracer)
David Roundy
droundy at darcs.net
Fri Apr 18 16:32:29 EDT 2008
On Sat, Apr 19, 2008 at 12:19:19AM +0400, Bulat Ziganshin wrote:
> Saturday, April 19, 2008, 12:10:23 AM, you wrote:
> > The other problem I had with concurrency is that I was getting about a
> > 50% speedup instead of the 99% or so that I'd expect on two cores. I
>
> 2 cores doesn't guarantee 2x speedup. some programs are limited by
> memory access speed and you still have just one memory :)
In fact, this is relatively easily tested (albeit crudely): just run two
copies of your single-threaded program at the same time. If they take
longer than when run one at a time, you can guess that you're
memory-limited, and you won't get such good performance from threading your
code. But this is only a crude hint, since memory performance is strongly
dependent on cache behavior, and running one threaded job may do either
better or worse than two single-threaded jobs. If you've got two separate CPUs
with two separate caches, the simultaneous single-threaded jobs should beat the
threaded job (meaning they take less than twice as long as it does), since each
job should have full access to one cache. If you've got two cores sharing a single
cache, the behavior may be the opposite: the threaded job uses less total
memory than the two single-threaded jobs, so more of the data may stay in
cache.
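A minimal sketch of the crude test described above, written as a small Python harness only because it just launches an external program. The names `time_copies`, `slowdown` and `main` are made up for illustration, and the command you pass in is whatever single-threaded build you want to measure. It assumes a run is long enough that process startup is negligible, and that you have already run the program once so the first timing isn't skewed by a cold disk cache.

```python
import subprocess
import sys
import time


def time_copies(cmd, copies):
    """Start `copies` instances of cmd together; return the wall-clock
    seconds elapsed until the last one has exited."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for _ in range(copies)]
    for p in procs:
        p.wait()
    return time.perf_counter() - start


def slowdown(t_single, t_concurrent):
    """Ratio of concurrent time to solo time. A value near 1 suggests the
    copies did not get in each other's way; well above 1 hints at
    contention (memory bandwidth or cache being the usual suspects)."""
    return t_concurrent / t_single


def main(cmd):
    t1 = time_copies(cmd, 1)  # one copy on its own
    t2 = time_copies(cmd, 2)  # two copies at the same time
    print(f"one copy:   {t1:.2f} s")
    print(f"two copies: {t2:.2f} s (slowdown {slowdown(t1, t2):.2f}x)")


# Does nothing unless a command is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

Saved under a name of your choosing and invoked with your program and its arguments after the script name, it prints both timings. As noted above, the result is only a hint: a slowdown close to 1 on this test does not guarantee that the threaded version will scale, and vice versa.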
For reference, on a friend's dual quad-core Intel system (i.e. 8 cores
total), if he runs 8 simultaneous (identical) memory-intensive jobs he only
gets about five times the throughput of a single job, meaning that each core is
running at something like 60% of its CPU capacity due to memory
contention. It's possible that your system is comparably limited, although
I'd be surprised: somehow it seems unlikely that your ray tracer is
stressing the cache all that much.
--
David Roundy
Department of Physics
Oregon State University