[Haskell-cafe] Memory consumption issues under heavy network throughput/concurrency loads

Ben Bangert ben at groovie.org
Tue Jul 15 17:18:24 UTC 2014

I have been testing solutions in several languages for running a network daemon that accepts hundreds of thousands of websocket connections and moves messages between them. I originally used the websockets library (http://jaspervdj.be/websockets/), but after discovering a rather severe memory leak even when sending just a basic ping, I switched to Michael Snoyman's first stab at a websocket implementation for Yesod: https://github.com/yesodweb/yesod/commit/66437453f57e6a2747ff7c2199aa7ad25db5904c#diff-dea2b092b8e392f12cc93acb345745e1R58

Under mild load (5k connections, each pinging once every 10+ seconds), memory usage remained stable and acceptable at around 205 MB. Under higher load (pinging every second or less), memory usage spiked to 560 MB and continued to 'leak' slowly. When I quit the server with the profile diagnostics enabled, they indicated that hundreds of MB were "lost due to fragmentation".

In addition, merely opening the connections and dropping them, repeatedly, made the base memory usage go up. Somewhere, memory is not being fully reclaimed when the connections are dropped.

For a few thousand connections doing little, this wouldn't matter much. However, I'm trying to gauge whether it's feasible to use Haskell to handle 150-200k connections that regularly come and go and are held open for long periods of time. Such massive memory use, with what look like leaks (or fragmentation issues), is problematic.

I have created a very simple TCP based echo client/server here: https://github.com/bbangert/echo

The server can be run after compiling with no additional options, and will listen on port 8080.
The client can be run like so:
	./dist/build/echoclient/echoclient localhost 8080 2000 0.5

The last argument is the ping interval in seconds (0.5 means each client pings every half second); the second to last is how many clients to connect.
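To make the argument order concrete, here is a minimal sketch of how such a command line could be parsed. This is not the code from the echo repository; the `ClientConfig` record, its field names, and `parseArgs` are hypothetical and exist only for illustration.

```haskell
import Text.Read (readMaybe)

-- Hypothetical record; the names are illustrative, not taken from the repo.
data ClientConfig = ClientConfig
  { cfgHost     :: String  -- server host, e.g. "localhost"
  , cfgPort     :: String  -- server port, e.g. "8080"
  , cfgClients  :: Int     -- how many clients to connect
  , cfgInterval :: Double  -- seconds between pings for each client
  } deriving (Show, Eq)

-- Parse exactly four positional arguments: host, port, clients, interval.
-- Returns Nothing if the count is wrong or a number fails to parse.
parseArgs :: [String] -> Maybe ClientConfig
parseArgs [host, port, clients, interval] =
  ClientConfig host port <$> readMaybe clients <*> readMaybe interval
parseArgs _ = Nothing
```

With this sketch, the invocation above corresponds to `parseArgs ["localhost", "8080", "2000", "0.5"]`: 2000 clients, each pinging every 0.5 seconds.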

In my local tests, when pinging every 5+ seconds, 2k clients take about 50-75 MB of RAM. Pinging every 0.5 seconds jumps to ~180 MB of RAM, and this is for a mere 2k clients. Starting and stopping the echoclient repeatedly also causes the server's overall memory usage to climb higher and higher. Releasing the connections never quite gets the memory usage back down to where it started.
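A rough back-of-the-envelope sketch of why this matters at the target scale, assuming memory grows linearly with the number of connections (an untested assumption; the function names are illustrative only). At ~180 MB for 2k clients, that is about 0.09 MB per client, which would project to roughly 13.5 GB at 150k connections and 18 GB at 200k.

```haskell
-- Back-of-the-envelope extrapolation. Assumes (hypothetically) that memory
-- usage scales linearly with the number of connected clients; this has not
-- been measured. Uses 1 GB = 1000 MB.

-- | Observed memory per client, in MB.
perClientMB :: Double -> Double -> Double
perClientMB totalMB clients = totalMB / clients

-- | Projected total memory in GB if 'target' clients each used as much
-- memory as in a run where 'clients' clients used 'totalMB' MB.
projectedGB :: Double -> Double -> Double -> Double
projectedGB totalMB clients target = totalMB * (target / clients) / 1000
```

For example, `projectedGB 180 2000 200000` gives the 200k figure for the 0.5-second ping case, and `projectedGB 75 2000 200000` gives the same projection for the lighter 5+ second case.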

The issue seems to occur under both GHC 7.6.3 and 7.8.2.

I know there are a variety of GHC options that might be tuned. Am I missing some critical option to keep memory usage under control? Is there a better way to build high-throughput, highly concurrent TCP servers in Haskell that I'm missing?
