How is non-blocking IO working ?
Alastair Reid
alastair@reid-consulting-uk.ltd.uk
Thu, 03 Apr 2003 18:05:22 +0100
Ahn Ki-yung <kyagrd@hitel.net> writes:
> Document says GHC uses non-blocking IO to be thread friendly.
> [...]
> How is non-blocking and thread working ? I can't understand.
This is usually implemented using 'select' (see attached man page).
In a multithreaded program where you have user-level threads (which
is, effectively, what GHC's threads provide), you typically maintain a
global set of all file descriptors that a thread is trying to read
from. There are then two cases:
1) If there are no runnable threads, call 'select' on all threads
using a NULL timeout.
2) If there are runnable threads, you can use 'select' with a timeout
of 0 to poll for readable file descriptors every N seconds. The
number N should be chosen low enough to have low latency and high
enough to have low overhead.
An alternative that works on some operating systems is to use
asynchronous I/O calls. The problem is that this isn't as portable as
using select.
Hope this helps.
--
Alastair Reid alastair@reid-consulting-uk.ltd.uk
Reid Consulting (UK) Limited http://www.reid-consulting-uk.ltd.uk/alastair/
SELECT(2) Linux Programmer's Manual SELECT(2)
NAME
select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO - synchronous I/O
multiplexing
SYNOPSIS
/* According to POSIX 1003.1-2001 */
#include <sys/select.h>
/* According to earlier standards */
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
int select(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
struct timeval *timeout);
int pselect(int n, fd_set *readfds, fd_set *writefds, fd_set
*exceptfds, const struct timespec *timeout, const sigset_t *sigmask);
FD_CLR(int fd, fd_set *set);
FD_ISSET(int fd, fd_set *set);
FD_SET(int fd, fd_set *set);
FD_ZERO(fd_set *set);
DESCRIPTION
The functions select and pselect wait for a number of file descriptors
to change status.
Their function is identical, with three differences:
(i) The select function uses a timeout that is a struct timeval
(with seconds and microseconds), while pselect uses a struct
timespec (with seconds and nanoseconds).
(ii) The select function may update the timeout parameter to indicate
how much time was left. The pselect function does not change
this parameter.
(iii) The select function has no sigmask parameter, and behaves as
pselect called with NULL sigmask.
Three independent sets of descriptors are watched. Those listed in
readfds will be watched to see if characters become available for read-
ing (more precisely, to see if a read will not block - in particular, a
file descriptor is also ready on end-of-file), those in writefds will
be watched to see if a write will not block, and those in exceptfds
will be watched for exceptions. On exit, the sets are modified in
place to indicate which descriptors actually changed status.
Four macros are provided to manipulate the sets. FD_ZERO will clear a
set. FD_SET and FD_CLR add or remove a given descriptor from a set.
FD_ISSET tests to see if a descriptor is part of the set; this is use-
ful after select returns.
n is the highest-numbered descriptor in any of the three sets, plus 1.
timeout is an upper bound on the amount of time elapsed before select
returns. It may be zero, causing select to return immediately. (This is
useful for polling.) If timeout is NULL (no timeout), select can block
indefinitely.
sigmask is a pointer to a signal mask (see sigprocmask(2)); if it is
not NULL, then pselect first replaces the current signal mask by the
one pointed to by sigmask, then does the `select' function, and then
restores the original signal mask again.
The idea of pselect is that if one wants to wait for an event, either a
signal or something on a file descriptor, an atomic test is needed to
prevent race conditions. (Suppose the signal handler sets a global flag
and returns. Then a test of this global flag followed by a call of
select() could hang indefinitely if the signal arrived just after the
test but just before the call. On the other hand, pselect allows one to
first block signals, handle the signals that have come in, then call
pselect() with the desired sigmask, avoiding the race.) Since Linux
today does not have a pselect() system call, the current glibc2 routine
still contains this race.
The timeout
The time structures involved are defined in <sys/time.h> and look like
struct timeval {
long tv_sec; /* seconds */
long tv_usec; /* microseconds */
};
and
struct timespec {
long tv_sec; /* seconds */
long tv_nsec; /* nanoseconds */
};
Some code calls select with all three sets empty, n zero, and a non-
null timeout as a fairly portable way to sleep with subsecond preci-
sion.
On Linux, the function select modifies timeout to reflect the amount of
time not slept; most other implementations do not do this. This causes
problems both when Linux code which reads timeout is ported to other
operating systems, and when code is ported to Linux that reuses a
struct timeval for multiple selects in a loop without reinitializing
it. Consider timeout to be undefined after select returns.
RETURN VALUE
On success, select and pselect return the number of descriptors con-
tained in the descriptor sets, which may be zero if the timeout expires
before anything interesting happens. On error, -1 is returned, and
errno is set appropriately; the sets and timeout become undefined, so
do not rely on their contents after an error.
ERRORS
EBADF An invalid file descriptor was given in one of the sets.
EINTR A non blocked signal was caught.
EINVAL n is negative.
ENOMEM select was unable to allocate memory for internal tables.
EXAMPLE
#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
int
main(void) {
fd_set rfds;
struct timeval tv;
int retval;
/* Watch stdin (fd 0) to see when it has input. */
FD_ZERO(&rfds);
FD_SET(0, &rfds);
/* Wait up to five seconds. */
tv.tv_sec = 5;
tv.tv_usec = 0;
retval = select(1, &rfds, NULL, NULL, &tv);
/* Don't rely on the value of tv now! */
if (retval)
printf("Data is available now.\n");
/* FD_ISSET(0, &rfds) will be true. */
else
printf("No data within five seconds.\n");
return 0;
}
CONFORMING TO
4.4BSD (the select function first appeared in 4.2BSD). Generally
portable to/from non-BSD systems supporting clones of the BSD socket
layer (including System V variants). However, note that the System V
variant typically sets the timeout variable before exit, but the BSD
variant does not.
The pselect function is defined in IEEE Std 1003.1g-2000 (POSIX.1g),
and part of POSIX 1003.1-2001. It is found in glibc2.1 and later.
Glibc2.0 has a function with this name, that however does not take a
sigmask parameter.
NOTES
Concerning prototypes, the classical situation is that one should
include <time.h> for select. The POSIX 1003.1-2001 situation is that
one should include <sys/select.h> for select and pselect. Libc4 and
libc5 do not have a <sys/select.h> header; under glibc 2.0 and later
this header exists. Under glibc 2.0 it unconditionally gives the wrong
prototype for pselect, under glibc 2.1-2.2.1 it gives pselect when
_GNU_SOURCE is defined, under glibc 2.2.2-2.2.4 it gives it when
_XOPEN_SOURCE is defined and has a value of 600 or larger. No doubt,
since POSIX 1003.1-2001, it should give the prototype by default.
SEE ALSO
For a tutorial with discussion and examples, see select_tut(2).
For vaguely related stuff, see accept(2), connect(2), poll(2), read(2),
recv(2), send(2), sigprocmask(2), write(2)
Linux 2.4 2001-02-09 SELECT(2)