Network/Notwork?

Claus Reinke claus.reinke@talk21.com
Sun, 16 Mar 2003 23:59:28 -0000


> Happy with my Winsock work-arounds for my small client-/server-test,
> I decided to try integrating the Network use into my target project, and
> got nothing but trouble. Again, things that work happily under Unix
> simply fail under windows.   [...]

OK, sometimes I am a bit slow...
In my defense: misleading error messages can be a bit, well, misleading.

But first, for those who run into this via a search engine: the main error
turned out to be in my programming, not in the Network library, which 
works just fine (modulo the previously discussed fixes, and a couple of
smaller issues). I'll explain below in case you see the same effects.

Second, there are some smallish bugs in the Network and IO libraries.

    - in IO operations working on the result of socketToHandle, when 
      sockets are implemented using Winsock, error messages should consult
      WSAGetLastError (in my case, I kept getting "doesn't exist"-errors,
      where the real problem was WSAECONNRESET 10054 "Connection 
      reset by peer."). Note this is a separate issue from the "proper"
      socket operations consulting WSAGetLastError.

    - in Network.Socket: SocketOption may include Linger, but 
      setSocketOption uses packSocketOption, which doesn't know
      about Linger? And shouldn't Linger be on by default for the 
      high-level Network lib?

Back to my little problem: what happened was that, in moving from
the test to the integration, I had made one small change, namely to move
from a single accept and a single loop in the server to several accepts
with separate loops. In fact, the motivation for all this is that I wanted
to have frequent calls to very lightweight and short-lived clients, using
the portable Network library to avoid having to use completely different 
mechanisms on the platforms we want to support.

Experienced Networkers will guess what happened: the clients sent
their request and terminated (the backchannel from the server to the 
application that calls the clients has to be implemented using another, 
though equally portable, mechanism). The server either read the request 
before the client closed the connection - or it didn't, resulting in a 
"connection reset by peer"-error when the server finally got around to 
trying to read the request, and that error masqueraded as some different 
error (e.g., "doesn't exist") because the IO operations on the handle do
not consult WSAGetLastError.
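
Since the reported error category cannot be trusted here, a server loop can
at least treat any failed read as "client gone" and keep the raw error text
for diagnosis. What follows is only a minimal sketch under that assumption;
the names ReadResult and readRequest are hypothetical, not part of the
Network or IO libraries, and the handle is assumed to come from something
like socketToHandle.

```haskell
import Control.Exception (try)
import System.IO
import System.IO.Error (isDoesNotExistError, isEOFError)

-- | Outcome of trying to read one request line from a client.
-- (Hypothetical type, introduced only for this sketch.)
data ReadResult
  = Request String      -- a complete line arrived
  | ClientGone String   -- the read failed; the payload is a diagnostic
  deriving (Eq, Show)

-- | Read one line, turning any IOError into a ClientGone value instead
-- of letting a possibly mislabelled exception escape.
readRequest :: Handle -> IO ReadResult
readRequest h = do
  r <- try (hGetLine h) :: IO (Either IOError String)
  return $ case r of
    Right line -> Request line
    Left e
      | isEOFError e          -> ClientGone "end of input"
      -- On Winsock this category may really be a connection reset.
      | isDoesNotExistError e -> ClientGone ("reported as missing: " ++ show e)
      | otherwise             -> ClientGone (show e)
```

This does not fix the misleading message; it only keeps one short-lived
client from taking the server down while the real cause is investigated.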

Workarounds: 
    - using hFlush does *not* seem to cure the problem??
    - using setSocketOption with Linger to control lingering of sockets with 
      pending data in response to close operations might be the "proper" way?
    - letting the short-lived connection partner wait for an acknowledgement
      from the longer-lived one helps me out in the meantime.
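
The last workaround can be sketched roughly as follows. This is an
illustration under assumptions, not the code from my project: the function
names and the literal "ACK" line are made up, and the sketch takes separate
read and write handles so that it can be tried over two pipes. With a real
connection, the single handle from socketToHandle would be passed for both.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import System.IO
import System.Process (createPipe)

-- | Short-lived side: send one request line, then block until the
-- acknowledgement arrives. Close the connection only after this returns.
-- (hGetLine throws if the peer goes away before acknowledging.)
requestWithAck :: Handle -> Handle -> String -> IO Bool
requestWithAck hIn hOut req = do
  hPutStrLn hOut req
  hFlush hOut
  reply <- hGetLine hIn
  return (reply == "ACK")

-- | Long-lived side: read one request line, acknowledge it so the
-- client may go away, then hand the request to the handler.
serveWithAck :: Handle -> Handle -> (String -> IO ()) -> IO ()
serveWithAck hIn hOut handler = do
  req <- hGetLine hIn
  hPutStrLn hOut "ACK"
  hFlush hOut
  handler req

-- | Demo: two pipes stand in for one bidirectional connection.
demo :: IO (Bool, String)
demo = do
  (clientIn, serverOut) <- createPipe   -- server writes, client reads
  (serverIn, clientOut) <- createPipe   -- client writes, server reads
  received <- newEmptyMVar
  _ <- forkIO (serveWithAck serverIn serverOut (putMVar received))
  ok  <- requestWithAck clientIn clientOut "hello"
  req <- takeMVar received
  mapM_ hClose [clientIn, clientOut, serverIn, serverOut]
  return (ok, req)
```

The acknowledgement only says "request read", not "request processed", so
the client waits no longer than it takes the server to get around to the
read.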

Sorry about the confusion. Once the fixes make it to the released versions,
the Network library is a nice asset. Perhaps a minimal tutorial, with a list of
traps to avoid, would be helpful - certainly more interesting than the old
quicksort example on haskell.org;-)

Claus

"experience is what you don't have until just after you need it" (source unknown)