[Haskell-cafe] Optimizing a high-traffic network architecture
Joel Reymont
joelr1 at gmail.com
Thu Dec 15 10:37:36 EST 2005
On Dec 15, 2005, at 2:02 PM, Simon Marlow wrote:
> Hmm, your machine is spending 50% of its time doing nothing, and the
> network traffic is very low. I wouldn't expect 2k connections to pose
> any problem at all, so further investigation is definitely required.
>
> With 2k connections the overhead of select() is going to start to be a
> problem. You would notice the system time going up. -threaded may help
> with this, because it calls select() less often.
I ran two more tests today after making a few changes. The end result
is that increasing the thread stack space makes the program run
significantly faster: it was able to launch about 1,000 more bots
within the same hour.
The 267Mb of physical memory and 423Mb of VM at the end of the 2nd
test are something that I really need to look into. 80%
CPU utilization by the app is probably a combination of select on 4k
sockets
The 89 failures are all "connection reset by peer"; the probable cause
is my wireless LAN.
I'm now using the threaded runtime. Worker threads write to the
socket. There's one thread monitoring all the timers. Started about
12:30pm with no thread stack increase and full (very verbose) logging.
It's running 5 OS threads pretty consistently.
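A minimal sketch of the "one thread monitoring all the timers" arrangement, assuming timers are kept in a shared map ordered by deadline and the timer thread polls it. This is not the actual bot code: TimerState, startTimer, timerLoop, popDue and the 10 ms polling interval are all hypothetical names and choices made for illustration. It needs only base and containers.

```haskell
module Main where

import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar
  (MVar, modifyMVar, modifyMVar_, newEmptyMVar, newMVar, putMVar, takeMVar)
import Control.Monad (forever)
import qualified Data.Map.Strict as Map
import Data.Word (Word64)
import GHC.Clock (getMonotonicTimeNSec)

-- Pending timers, keyed by (deadline in ns, unique id) so that two timers
-- with the same deadline do not overwrite each other.
type Timers a = Map.Map (Word64, Int) a

-- Hypothetical state shared between the bot threads and the timer thread.
data TimerState = TimerState
  { nextId  :: !Int
  , pending :: !(Timers (IO ()))
  }

-- Pure core: split off every timer whose deadline is <= now.
popDue :: Word64 -> Timers a -> ([a], Timers a)
popDue now timers = (Map.elems due, rest)
  where
    (due, rest) = Map.spanAntitone (\(deadline, _) -> deadline <= now) timers

-- Register an action to run after delayMs milliseconds.
startTimer :: MVar TimerState -> Int -> IO () -> IO ()
startTimer var delayMs action = do
  now <- getMonotonicTimeNSec
  let deadline = now + fromIntegral delayMs * 1000000
  modifyMVar_ var $ \(TimerState n p) ->
    pure (TimerState (n + 1) (Map.insert (deadline, n) action p))

-- The single timer thread: wake up periodically and fire whatever is due.
timerLoop :: MVar TimerState -> IO ()
timerLoop var = forever $ do
  now <- getMonotonicTimeNSec
  due <- modifyMVar var $ \st ->
    let (actions, rest) = popDue now (pending st)
    in pure (st { pending = rest }, actions)
  sequence_ due
  threadDelay 10000 -- 10 ms polling interval, an arbitrary choice

main :: IO ()
main = do
  var  <- newMVar (TimerState 0 Map.empty)
  done <- newEmptyMVar
  _ <- forkIO (timerLoop var)
  startTimer var 50 (putMVar done ())
  takeMVar done
  putStrLn "timer fired"
```

With thousands of bots this keeps the number of Haskell threads devoted to timeouts at one, at the cost of timer resolution being bounded by the polling interval. A variant that sleeps until the nearest deadline would avoid the polling but needs a way to wake the thread when an earlier timer is registered.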
(Columns after Failed: physical/virtual memory in Mb, CPU utilization,
time.)
Total: 399, Lobby: 398, Failed: 0, 26/81, 10-20%
Total: 819, Lobby: 810, Failed: 0, 52/119, 20-30%
Total: 1051, Lobby: 1048, Failed: 0, 63/136, 30-50%
Total: 1229, Lobby: 1219, Failed: 0, 74/153, 30-50%
Total: 1318, Lobby: 1299, Failed: 0, 76/157, 30-50%
Total: 1448, Lobby: 1433, Failed: 0, 82/167, 40-60%, 13:06
Total: 1544, Lobby: 1526, Failed: 0, 86/174, 50-60%, 13:13
Total: 1672, Lobby: 1648, Failed: 0, 90/182, 50-60%, 13:23
Total: 1754, Lobby: 1727, Failed: 0, 91/186, 40-60%, 13:31
Total: 1824, Lobby: 1796, Failed: 0, 93/189, 50-60%, 13:40
With reduced logging and +RTS -k3k. Started at 13:42, 4 OS threads.
Total: 367, Lobby: 363, Failed: 0, 24/76, 10%, 13:49
Total: 516, Lobby: 510, Failed: 14, 34/91, 10-20%, 13:52
Total: 841, Lobby: 836, Failed: 17, 49/116, 20%, 13:56
Total: 1450, Lobby: 1434, Failed: 34, 97/181, 20-50-80%, 14:08
Total: 2008, Lobby: 1999, Failed: 35, 133/234, 70-80%, 14:20
Total: 2318, Lobby: 2308, Failed: 35, 154/263, 70-85%, 14:29
Total: 2623, Lobby: 2613, Failed: 35, 174/293, 70-80%, 14:39
Total: 2862, Lobby: 2854, Failed: 35, 191/316, 70-80%, 14:47
Total: 3151, Lobby: 3142, Failed: 40, 214/347, 60-80%, 14:56
Total: 3364, Lobby: 3355, Failed: 40, 219/359, 60-80%, 15:03
Total: 3808, Lobby: 3744, Failed: 89, 247/398, 70-85%, 15:19
Total: 4000, Lobby: 3938, Failed: 89, 267/423, 80%, 15:27
The system had 120+Mb of free physical memory around 3pm and was not
swapping heavily, as the number of page outs was not increasing.
There's a total of 1Gb of physical memory. 4 OS threads became 5 at
some point.
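
The two runs can be compared with a little arithmetic on the log lines above. This is a rough sketch: it takes the first run's start as exactly 12:30 (the post says "about 12:30pm"), treats the last log line of each run as its end, and reads the memory column as physical Mb. The function names are made up for illustration. On those assumptions the first run averages roughly 26 bots per minute and the second roughly 38.

```haskell
module Main where

import Text.Printf (printf)

-- Minutes since midnight for an HH:MM wall-clock time.
minutes :: Int -> Int -> Int
minutes h m = h * 60 + m

-- Average bots launched per minute between the start time and the
-- time of the final log line.
launchRate :: Int -> Int -> Int -> Double
launchRate bots start end = fromIntegral bots / fromIntegral (end - start)

-- Physical memory per bot in Kb, given the total in Mb. This includes the
-- runtime's own baseline, so it overstates the true per-bot cost.
kbPerBot :: Int -> Int -> Double
kbPerBot physicalMb bots = fromIntegral physicalMb * 1024 / fromIntegral bots

main :: IO ()
main = do
  -- Run 1: no stack increase, verbose logging, assumed 12:30 start.
  let run1 = launchRate 1824 (minutes 12 30) (minutes 13 40)
  -- Run 2: +RTS -k3k, reduced logging, started 13:42.
  let run2 = launchRate 4000 (minutes 13 42) (minutes 15 27)
  printf "run 1: %.1f bots/min, %.0f Kb/bot\n" run1 (kbPerBot 93 1824)
  printf "run 2: %.1f bots/min, %.0f Kb/bot\n" run2 (kbPerBot 267 4000)
```

Note that the second run changed both the stack size and the amount of logging, so the difference in rate cannot be attributed to either change alone from these numbers.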
--
http://wagerlabs.com/