[Haskell-cafe] Re: Odd parallel haskell observations (some more numbers)

Jean-Marie Gaillourdet jmg at gaillourdet.net
Mon Aug 9 03:44:00 EDT 2010


Hello,

I am no expert in web server tuning, but I will share my thoughts about your approach and expectations nevertheless.

On 08.08.2010, at 21:07, Alexander Kotelnikov wrote:

> So I continue to issue thousands of HTTP GET requests to a local apache
> and got some ThreadScope pictures and numbers (BTW, all this happens on
> a 4-core machine).

Your Apache configuration is crucial for the performance figures you get from your "benchmark". As far as I know, web server benchmarks usually run continuously and report the current throughput or the average latency over the last n seconds, or something like that.

This allows the tested web server to adapt to the kind of load it experiences, and the benchmarker can wait until those numbers stabilize.
When you execute your program with a different number of capabilities (different -N settings), Apache will see a different kind of load and behave differently. This makes it hard to change your program and expect comparable results.
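
To illustrate what I mean by "throughput of the last n seconds", here is a minimal sketch. It is not taken from any real benchmarking tool; the name windowThroughput and the timestamps are made up for the example, and it assumes you record a completion time (in seconds) for every finished request.

```haskell
module Main where

-- | Requests per second over the trailing window of the given length.
-- 'now' and the completion times are seconds on the same clock.
-- (Hypothetical helper, only for illustration.)
windowThroughput :: Double -> Double -> [Double] -> Double
windowThroughput window now completions =
  fromIntegral (length recent) / window
  where
    -- Keep only requests that completed inside (now - window, now].
    recent = [t | t <- completions, t > now - window, t <= now]

main :: IO ()
main = do
  -- Made-up completion times (seconds since benchmark start).
  let completions = [0.1, 0.4, 0.9, 1.2, 1.3, 1.8, 2.5, 2.6, 2.9, 3.0]
  mapM_ report' [1.0, 2.0, 3.0]
  where
    report' now =
      putStrLn (show now ++ "s: "
                ++ show (windowThroughput 1.0 now [0.1, 0.4, 0.9, 1.2, 1.3, 1.8, 2.5, 2.6, 2.9, 3.0])
                ++ " req/s")
```

Reporting such a number repeatedly while the load keeps running lets you see when it has settled, instead of relying on a single total run time.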

> I would point out the following as deserving an explanation:
> 1. It looks like none of the tests used the resources of more than 2 cores.

This might be an indication that CPU resources are not the limiting factor in this benchmark. You are basically benchmarking the I/O capabilities of your operating system, of your installed Apache with your configuration, and of your installed GHC.

Therefore, increasing the available CPU resources does not necessarily lead to increased performance.

> 2. Sometimes there is no activity in any thread of a running program
> (get.N4qg_withgaps.eventlog.png, does this mean that process is in an OS
> queue for scheduling or something else?)


> 3. Without RTS's -c or -qg multithreaded run suffers from excessive GC
> actions.
> 4. Even with -c/-qg thread's run looks to be interrupted too frequently.
> 5. Provided that 10000 requests in a row can be completed in ~3.4s I
> would expect that 4 threads might come close or even under 1s, but 1.9s
> was the best result.

A last point to consider:

Is getRequest strict? Does it internally use some kind of lazy IO? Is it possible that some resources aren't properly freed? Perhaps because the library relies on the garbage collector to reclaim sockets? Or because the requests aren't completely read?

I simply don't know the internals of Network.HTTP, but if it uses lazy IO it is IMHO not suitable for such a benchmark.
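
If you want to rule out laziness as a factor, one option is to force the whole result before you stop the clock. The following is only a sketch under assumptions: timeForced is a name I made up, the action in main is a stand-in rather than a real HTTP call, and I am not claiming this is how Network.HTTP behaves or should be used.

```haskell
module Main where

import Control.Exception (evaluate)
import GHC.Clock (getMonotonicTime)

-- | Run an action that yields a (possibly lazy) String, force the whole
-- result, and return it together with the elapsed wall-clock seconds.
-- (Hypothetical helper, only for illustration.)
timeForced :: IO String -> IO (String, Double)
timeForced action = do
  start <- getMonotonicTime
  body  <- action
  -- 'length' walks the full string, so any deferred reading happens
  -- here, inside the timed region, and not at some later point.
  _     <- evaluate (length body)
  end   <- getMonotonicTime
  return (body, end - start)

main :: IO ()
main = do
  -- Stand-in for a request; a real benchmark would put the HTTP call here.
  (body, secs) <- timeForced (return (replicate 100000 'x'))
  putStrLn ("read " ++ show (length body) ++ " chars in " ++ show secs ++ " s")
```

If the timings change noticeably once the body is forced like this, that would suggest the original numbers were measuring something other than complete requests.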

Just my two euro cents.

-- Jean



