[Haskell-cafe] Re: Hugs vs GHC (again) was: Re: Some random newbie questions

Thu Jan 20 15:52:02 EST 2005

Keean Schupke wrote:

> Why is disk a special case?

With "slow" streams, where there may be an indefinite delay before the
data is available, you can use non-blocking I/O, asynchronous I/O,
select(), poll() etc to determine if the data is available.

If it is, reading the data is essentially just copying from kernel
memory to userspace.

If it isn't, the program can do something else while it's waiting for
the data to arrive.
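
As a rough illustration of the "slow" stream case, here is a minimal
Python sketch, assuming a Unix system, with a pipe standing in for any
socket/pipe/terminal. The helper name is_readable is made up for this
example; select.select(), os.pipe(), os.read() and os.write() are the
standard library wrappers around the corresponding system calls.

```python
import os
import select

# A pipe stands in for any "slow" stream (socket, terminal, ...).
read_fd, write_fd = os.pipe()

def is_readable(fd):
    """Hypothetical helper: poll fd with a zero timeout, never blocking."""
    ready, _, _ = select.select([fd], [], [], 0)
    return fd in ready

before = is_readable(read_fd)   # nothing has been written yet
os.write(write_fd, b"hello")
after = is_readable(read_fd)    # data now sits in the kernel's pipe buffer
data = os.read(read_fd, 5)      # essentially a copy from kernel to userspace

os.close(read_fd)
os.close(write_fd)
```

While is_readable() reports that nothing is there, the program is free
to get on with other work instead of sitting in read().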

With files or block devices, the data is always deemed to be
"available", even if the data isn't in physical memory. Calling read()
in such a situation will block until the data has been read into
memory.
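
The "always available" behaviour can be observed with a sketch along
these lines (assuming a Unix system; the temporary file is purely
illustrative). Note that it only shows what select() reports: a freshly
written file is almost certainly still in the page cache, so it cannot
demonstrate the blocking itself.

```python
import os
import select
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"x" * 4096)

    # Positioned at the start of the file: select() says "readable".
    os.lseek(fd, 0, os.SEEK_SET)
    ready, _, _ = select.select([fd], [], [], 0)
    ready_with_data = fd in ready

    # Positioned at end of file: select() still says "readable".
    os.lseek(fd, 0, os.SEEK_END)
    ready, _, _ = select.select([fd], [], [], 0)
    ready_at_eof = fd in ready
finally:
    os.close(fd)
    os.unlink(path)
```

In both positions select() reports the descriptor as ready, so it tells
you nothing about whether a subsequent read() will have to wait for the
disk.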

> I have never heard that all processes
> under linux wait for a disk read... The kernel most certainly does
> not busy wait for disks to respond, so the only alternative is that
> the process that needs to wait (and only that process) is put to
> sleep. In which case a second thread would be unaffected.

Correct. The point is that maximising CPU utilisation while disk I/O
is pending requires the use of multiple kernel threads; select/poll or
non-blocking/asynchronous I/O won't suffice.

> Linux does not busy wait in the Kernel! (don't forget the kernel
> does read-ahead, so it could be that read really does return
> 'immediately' and without any delay apart from at the end of file -
> In which case asynchronous IO just slows you down with extra context
> switches).

It doesn't busy wait; it suspends the process/thread, then schedules
some other runnable process/thread. The original thread remains
suspended until the data has been transferred into physical memory.

Reading data from a descriptor essentially falls into three cases:

1. The data is in physical RAM. read() copies the data to the supplied
user-space buffer then returns control to the caller.

2. The data isn't in physical RAM, but is available with only a finite
delay (i.e. the time taken to read it from a block device or network
filesystem).

3. The data isn't in physical RAM, and may take an indefinite amount
of time to arrive (e.g. from a socket, pipe, terminal etc).

The central issue is that the Unix API doesn't distinguish between
cases 1 and 2 when it comes to non-blocking I/O, asynchronous I/O,
select/poll etc. [OTOH, NT overlapped I/O and certain Unix extensions
do distinguish these cases, i.e. data is only "available" when it's in
physical RAM.]

If you read from a non-blocking descriptor, and case 2 applies, read()
will block while the data is read from disk, then return the data; it
won't return -1 with errno set to EAGAIN, as would happen in case 3.
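
The difference between a non-blocking pipe and a non-blocking regular
file can be sketched as follows (again assuming a Unix system; the
variable names are only for illustration). Python reports EAGAIN from
read() by raising BlockingIOError. As before, the file's data will be in
the page cache, so this shows only that O_NONBLOCK never yields EAGAIN
for a regular file, not the blocking itself.

```python
import errno
import os
import tempfile

# Case 3: an empty pipe in non-blocking mode fails with EAGAIN.
read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)
try:
    os.read(read_fd, 16)
    pipe_would_block = False
except BlockingIOError as exc:
    pipe_would_block = exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
finally:
    os.close(read_fd)
    os.close(write_fd)

# Cases 1/2: setting O_NONBLOCK on a regular file is accepted, but
# read() simply returns the data, waiting for the disk if it has to.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"payload")
    os.lseek(fd, 0, os.SEEK_SET)
    os.set_blocking(fd, False)
    file_data = os.read(fd, 16)
finally:
    os.close(fd)
    os.unlink(path)
```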

If you want to be able to utilise the CPU while waiting for disk I/O
to occur, you have to use multiple kernel threads, with one thread for
each pending I/O operation, plus another one for computations (or
another one for each CPU if you want to obtain the full benefit of
an SMP system).
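
A minimal sketch of that arrangement, assuming CPython (whose threads
are kernel threads, and which releases its interpreter lock around
blocking reads): one worker thread per pending read, with the main
thread doing the computation. The names read_whole_file and compute,
and the three temporary files, are invented for the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def read_whole_file(path):
    """Blocking read; runs in its own kernel thread."""
    with open(path, "rb") as f:
        return f.read()

def compute(n):
    """Stand-in for the CPU-bound work done while reads are pending."""
    return sum(i * i for i in range(n))

# Create three small files to read back.
paths = []
for i in range(3):
    fd, path = tempfile.mkstemp()
    os.write(fd, bytes([65 + i]) * 1024)
    os.close(fd)
    paths.append(path)

try:
    # One worker thread for each pending I/O operation.
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        pending = [pool.submit(read_whole_file, p) for p in paths]
        total = compute(1000)                  # main thread keeps the CPU busy
        contents = [f.result() for f in pending]
finally:
    for p in paths:
        os.unlink(p)
```

If a worker does have to wait for the disk, only that worker is
suspended; the thread running compute() stays runnable.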

Even then, you still have to allow for the fact that user-space
"memory" is subject to swapping and demand-paging.

-- 
Glynn Clements <glynn at gclements.plus.com>