[Haskell-cafe] Optimizing a high-traffic network architecture

Joel Reymont joelr1 at gmail.com
Thu Dec 15 05:21:09 EST 2005


Here are statistics that I gathered. I'm almost done modifying the  
program to use 1 timer thread instead of 1 per bot as well as writing  
to the socket from the writer thread. This should reduce the number  
of threads from 6k (2k x 3) to 2k plus change.
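As a rough illustration of the "1 timer thread instead of 1 per bot" idea, here is a minimal sketch, not the actual program. It assumes the bots register their keep-alive deadlines in a shared queue and a single thread wakes once a second to fire whatever is due. All names (BotId, Tick, TimerQueue, schedule, popDue, startTimerThread) are hypothetical, and the one-second tick granularity is an assumption.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (MVar, modifyMVar)
import qualified Data.Map.Strict as Map

-- Hypothetical names throughout; none of them come from the original program.
type BotId = Int
type Tick  = Int  -- ticks counted by the timer thread, roughly one per second

-- Pending timers keyed by (deadline, bot), so the earliest deadline sorts first.
type TimerQueue a = Map.Map (Tick, BotId) a

-- Register a payload (e.g. a keep-alive action) for a bot at a given tick.
schedule :: Tick -> BotId -> a -> TimerQueue a -> TimerQueue a
schedule at bot payload = Map.insert (at, bot) payload

-- Split off every entry whose deadline is at or before 'now'.
popDue :: Tick -> TimerQueue a -> ([a], TimerQueue a)
popDue now q =
  let (due, rest) = Map.spanAntitone (\(t, _) -> t <= now) q
  in (Map.elems due, rest)

-- One thread serves all bots: each tick it removes the due actions from the
-- shared queue and runs them. Ticks are counted, not read from a clock, so
-- they drift by however long the due actions take to run.
startTimerThread :: MVar (TimerQueue (IO ())) -> IO ()
startTimerThread var = do
  _ <- forkIO (loop 0)
  return ()
  where
    loop now = do
      due <- modifyMVar var $ \q ->
        let (acts, rest) = popDue now q in return (rest, acts)
      sequence_ due
      threadDelay 1000000  -- one second, in microseconds
      loop (now + 1)
```

A bot would add its next keep-alive with something like modifyMVar_ var (return . schedule deadline botId sendKeepAlive), so the per-bot timer thread goes away and only the queue entry remains.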

It appears that +RTS -k3k does make a difference. As per Simon, 2-4k  
avoids the thread being garbage collected because each thread gets  
its own block in the storage manager. Simon, did I get that right?

BTW, how does garbage-collecting a thread work in this scenario? My  
threads are very long-running.

Total is the number of bots launched; Lobby is how many of them  
connected to the lobby. Failures are mostly "connection reset by  
peer" errors. The Windows C++ server uses IOCP and running a firewall  
was apparently interfering with that somehow. I hate Windows :-(.

--- Test #1, +RTS -k3k as per Simon. Keep-alive timeout of 9 minutes.

Total:   1961, Lobby:   1961, Failed:  0
Total:   2000, Lobby:   2000, Failed:  1

This test went smoothly and got to 2k connections very quickly. Maybe  
within 30 minutes or so. I did not gather CPU usage, etc. statistics.

--- Test #2, No thread stack increase, 1 minute keep-alive timeout,  
more network traffic

With a 1 minute timeout things run veeery slow. 86Mb of physical  
memory and 158Mb of VM with 1k bots, CPU 50-60%. Data sent/received  
is 60-70 packets and 6-7kb/sec. Killed after a while.

The extra columns are physical/virtual memory in Mb, CPU usage in %,  
and packets/transfer speed per second.

Total:   1345, Lobby:   1326, Failed:  0, 102/184, 50%, 90/8kb
Total:   1395, Lobby:   1367, Failed:  2
Total:   1421, Lobby:   1394, Failed:  4
Total:   1490, Lobby:   1463, Failed:  4, 108/194, 50%, 110/11kb
Total:   1574, Lobby:   1546, Failed:  4, 113/202, 50%, 116/11kb

--- Test #3, Rebuilding app with basic logging only (level 10). Still  
veeery slow. Started ~6pm

Total:   121, Lobby:   118, Failed:  1
Total:   521, Lobby:   509, Failed:  13, 46/104, 20-30%, 35/3kb
Total:   1055, Lobby:   1044, Failed:  13, 94/168, 50%
Total:   1325, Lobby:   1313, Failed:  13
Total:   1566, Lobby:   1553, Failed:  13, 126/215, 70-80%
Total:   1692, Lobby:   1680, Failed:  13, 136/228, 80%
Total:   1728, Lobby:   1715, Failed:  13, 140/234, 85%
Total:   1746, Lobby:   1733, Failed:  13, 140/235, 50-85%, 6:39pm
Total:   1818, Lobby:   1805, Failed:  13, 145/240, 60-85%
Total:   1896, Lobby:   1883, Failed:  13, 153/250, 60-85%, 7:01pm
Total:   1933, Lobby:   1919, Failed:  13, 155/255, 70-85%, 7:12pm

System has 216Mb of spare physical memory at this point but the app  
seems to spend most of the time collecting garbage.

Total:   1999, Lobby:   1986, Failed:  13, 162/262, 65-86%, 7:41pm
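
The impression that the app spends most of its time collecting garbage could be checked from inside the program. The following is only a sketch under assumptions: it uses the GHC.Stats module from recent versions of base (not something available in the GHC used for these tests), it requires running the program with +RTS -T, and the helper names gcFraction and reportGcShare are made up for illustration.

```haskell
import Data.Int (Int64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Fraction of CPU time spent in the garbage collector, between 0 and 1.
-- Returns 0 when no CPU time has been recorded yet.
gcFraction :: Int64 -> Int64 -> Double
gcFraction gcNs totalNs
  | totalNs <= 0 = 0
  | otherwise    = fromIntegral gcNs / fromIntegral totalNs

-- Print the GC share of total CPU time so far.
-- The RTS only collects these numbers when started with +RTS -T.
reportGcShare :: IO ()
reportGcShare = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS stats disabled; run with +RTS -T"
    else do
      s <- getRTSStats
      let pct = 100 * gcFraction (gc_cpu_ns s) (cpu_ns s)
      putStrLn ("GC share of CPU time: " ++ show pct ++ "%")
```

Calling reportGcShare alongside each Total/Lobby/Failed line would put a GC percentage next to the CPU column instead of leaving it as a guess.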

--
http://wagerlabs.com/






