[Haskell-cafe] Optimizing a high-traffic network architecture

Joel Reymont joelr1 at gmail.com
Thu Dec 15 10:37:36 EST 2005


On Dec 15, 2005, at 2:02 PM, Simon Marlow wrote:
> Hmm, your machine is spending 50% of its time doing nothing, and the
> network traffic is very low.  I wouldn't expect 2k connections to pose
> any problem at all, so further investigation is definitely required.
>
> With 2k connections the overhead of select() is going to start to be a
> problem.  You would notice the system time going up.  -threaded may help
> with this, because it calls select() less often.

I ran two more tests today after making a few changes. The end result
is that increasing the thread stack space made the program run
significantly faster: it launched about 1,000 more bots within
the same hour.

At the end of the 2nd test the app was using 267Mb of physical memory
and 423Mb of VM, which I will really need to look into. The 80% CPU
utilization by the app is probably due, at least in part, to select on
4k sockets.

The 89 failures are all "connection reset by peer"; the probable cause
is my wireless LAN.

I'm now using the threaded runtime. Worker threads write to the
socket. There's one thread monitoring all the timers. The first test
started at about 12:30pm with no thread stack increase and full (very
verbose) logging.

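It ran 5 OS threads pretty consistently. In the logs, the columns after
Failed are physical/virtual memory in Mb, CPU utilization and, where
recorded, the time of the sample.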

Total:  399, Lobby:  398, Failed: 0, 26/81, 10-20%
Total:  819, Lobby:  810, Failed: 0, 52/119, 20-30%
Total: 1051, Lobby: 1048, Failed: 0, 63/136, 30-50%
Total: 1229, Lobby: 1219, Failed: 0, 74/153, 30-50%
Total: 1318, Lobby: 1299, Failed: 0, 76/157, 30-50%
Total: 1448, Lobby: 1433, Failed: 0, 82/167, 40-60%, 13:06
Total: 1544, Lobby: 1526, Failed: 0, 86/174, 50-60%, 13:13
Total: 1672, Lobby: 1648, Failed: 0, 90/182, 50-60%, 13:23
Total: 1754, Lobby: 1727, Failed: 0, 91/186, 40-60%, 13:31
Total: 1824, Lobby: 1796, Failed: 0, 93/189, 50-60%, 13:40
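
A minimal sketch of the "one thread monitoring all the timers"
arrangement mentioned above. This is not the code behind these tests:
the names (Timers, addTimer, timerThread, demo) and the tick-based
design are assumptions made for illustration, using only
Control.Concurrent from base.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar
import Control.Monad (forM_, forever)
import Data.List (partition)

-- | A pending timer: ticks left before it fires, and the action to run.
type Timer = (Int, IO ())

-- | Shared table of pending timers (hypothetical name).
type Timers = MVar [Timer]

-- | Register an action to run after the given number of ticks.
addTimer :: Timers -> Int -> IO () -> IO ()
addTimer timers ticks action =
  modifyMVar_ timers (pure . ((ticks, action) :))

-- | Age every timer by one tick; return what is still pending and
-- the actions that are now due.
tick :: [Timer] -> ([Timer], [IO ()])
tick ts = (pending, map snd due)
  where
    aged           = [(n - 1, a) | (n, a) <- ts]
    (due, pending) = partition ((<= 0) . fst) aged

-- | The single monitoring thread: sleep for one tick, then run
-- whatever has come due. Actions run outside the MVar lock.
timerThread :: Int -> Timers -> IO ()
timerThread tickMicros timers = forever $ do
  threadDelay tickMicros
  due <- modifyMVar timers (pure . tick)
  sequence_ due

-- | Register three timers, let the monitor run, and report the order
-- in which they fired.
demo :: IO [String]
demo = do
  fired  <- newMVar []
  timers <- newMVar []
  forM_ [(3, "c"), (1, "a"), (2, "b")] $ \(n, name) ->
    addTimer timers n (modifyMVar_ fired (pure . (name :)))
  tid <- forkIO (timerThread 1000 timers)  -- 1ms per tick
  threadDelay 200000                       -- ample time for three ticks
  killThread tid
  reverse <$> readMVar fired
```

Loading this in GHCi and evaluating demo shows the firing order. The
point of the design is that thousands of bots share one sleeping thread
for their timeouts instead of each blocking in its own threadDelay. An
exception thrown by a timer action would kill the monitor in this
sketch, so a real version would need to guard each action.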

The second test ran with reduced logging and +RTS -k3k. It started at
13:42 with 4 OS threads.

Total:  367, Lobby:  363, Failed: 0,  24/76, 10%, 13:49
Total:  516, Lobby:  510, Failed: 14, 34/91, 10-20%, 13:52
Total:  841, Lobby:  836, Failed: 17, 49/116, 20%, 13:56
Total: 1450, Lobby: 1434, Failed: 34, 97/181, 20-50-80%, 14:08
Total: 2008, Lobby: 1999, Failed: 35, 133/234, 70-80%, 14:20
Total: 2318, Lobby: 2308, Failed: 35, 154/263, 70-85%, 14:29
Total: 2623, Lobby: 2613, Failed: 35, 174/293, 70-80%, 14:39
Total: 2862, Lobby: 2854, Failed: 35, 191/316, 70-80%, 14:47
Total: 3151, Lobby: 3142, Failed: 40, 214/347, 60-80%, 14:56
Total: 3364, Lobby: 3355, Failed: 40, 219/359, 60-80%, 15:03
Total: 3808, Lobby: 3744, Failed: 89, 247/398, 70-85%, 15:19
Total: 4000, Lobby: 3938, Failed: 89, 267/423, 80%, 15:27

The system had 120+Mb of free physical memory around 3pm and was not
swapping heavily, as the number of page outs was not increasing.
There's a total of 1Gb of physical memory. The 4 OS threads became 5
at some point.
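
To put numbers on the comparison, here is a small sketch that computes
the average launch rate and the memory per bot from the last line of
each log above. It assumes the first run started at exactly 12:30 (the
post only says "about 12:30pm") and that the first figure of each
memory pair is physical memory in Mb. The names Run, botsPerMinute and
kbPerBot are made up for the example.

```haskell
-- | Minutes since midnight for an hh:mm wall-clock time.
minutes :: Int -> Int -> Int
minutes h m = 60 * h + m

-- | One test run, summarised by its start time and its last log line.
data Run = Run
  { started    :: Int  -- ^ start time, minutes since midnight
  , finished   :: Int  -- ^ time of the last sample
  , bots       :: Int  -- ^ "Total" on the last sample
  , physicalMb :: Int  -- ^ physical memory in Mb on the last sample
  }

-- Figures copied from the logs; the 12:30 start is approximate.
run1, run2 :: Run
run1 = Run (minutes 12 30) (minutes 13 40) 1824 93
run2 = Run (minutes 13 42) (minutes 15 27) 4000 267

-- | Average launch rate in bots per minute over the whole run.
botsPerMinute :: Run -> Double
botsPerMinute r =
  fromIntegral (bots r) / fromIntegral (finished r - started r)

-- | Physical memory per bot in Kb, taking 1 Mb as 1024 Kb. This
-- includes the runtime's fixed overhead, so it overstates the true
-- per-bot cost.
kbPerBot :: Run -> Double
kbPerBot r = fromIntegral (physicalMb r) * 1024 / fromIntegral (bots r)
```

In GHCi, map botsPerMinute [run1, run2] comes out at roughly 26 and 38
bots per minute, and map kbPerBot [run1, run2] at roughly 52 and 68 Kb
per bot. Note that the second run changed both the logging level and
the stack size, so the rate difference cannot be attributed to -k3k
alone from these figures.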

--
http://wagerlabs.com/