new i/o library

Simon Marlow simonmarhaskell at gmail.com
Fri Jan 27 11:25:44 EST 2006


Bulat Ziganshin wrote:

> I'm now writing a new i/o library of sorts. One area where it currently
> falls short compared to the existing Handles implementation in GHC is
> asynchronous i/o operations. Can you please briefly describe how
> this is done in GHC and, in particular, why multiple buffers are used?

Multiple buffers were introduced to cope with the semantics we wanted 
for hPutStr.  The problem is that you don't want hPutStr to hold a lock 
on the Handle while it evaluates its argument list, because that could 
take arbitrary time.  Furthermore, things like this:

   putStr (trace "foo" "bar")

used to cause deadlocks: putStr held the lock while evaluating its 
argument list, which caused trace to try to acquire the lock on stdout 
as well.

So, putStr first grabs a buffer from the Handle, then unlocks the Handle 
while it fills up the buffer, then it takes the lock again to write the 
buffer.  Since another thread might try to putStr while the lock is 
released, we need multiple buffers.
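
A minimal sketch of the grab/fill/write scheme described above, using an 
MVar as the Handle lock.  This is not GHC's actual Handle code: MyHandle, 
grabBuffer, commitBuffer, myPutStr and noisy are hypothetical names, the 
"device" is just a list of committed chunks, and noisy is a stand-in for 
trace that writes to the same handle when its result is forced.

```haskell
import Control.Concurrent.MVar
import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical, simplified handle state (not GHC's representation).
data HState = HState
  { spareBufs :: [IORef String]  -- buffers currently free for filling
  , written   :: [String]        -- chunks committed so far, newest first
  }

newtype MyHandle = MyHandle (MVar HState)

newMyHandle :: IO MyHandle
newMyHandle = MyHandle <$> newMVar (HState [] [])

-- Step 1: briefly take the lock to grab a spare buffer, or allocate one
-- if every existing buffer is in use by another writer.
grabBuffer :: MyHandle -> IO (IORef String)
grabBuffer (MyHandle mv) = modifyMVar mv $ \st ->
  case spareBufs st of
    (b:bs) -> return (st { spareBufs = bs }, b)
    []     -> do b <- newIORef ""
                 return (st, b)

-- Step 3: take the lock again, write the buffer out, and return the
-- emptied buffer to the pool.
commitBuffer :: MyHandle -> IORef String -> IO ()
commitBuffer (MyHandle mv) buf = modifyMVar_ mv $ \st -> do
  chunk <- readIORef buf
  writeIORef buf ""
  return (st { spareBufs = buf : spareBufs st, written = chunk : written st })

myPutStr :: MyHandle -> String -> IO ()
myPutStr h s = do
  buf <- grabBuffer h
  -- Step 2: the lock is NOT held here, so forcing the lazy string may
  -- take arbitrarily long, or even call myPutStr on the same handle.
  writeIORef buf $! forceString s
  commitBuffer h buf
  where
    -- Evaluate every character, then return the string itself.
    forceString xs = foldr seq xs xs

-- Stand-in for trace: writes msg to the handle when the result is forced.
noisy :: MyHandle -> String -> String -> String
noisy h msg x = unsafePerformIO (myPutStr h msg) `seq` x
{-# NOINLINE noisy #-}

-- Everything committed so far, oldest first.
contents :: MyHandle -> IO [String]
contents (MyHandle mv) = reverse . written <$> readMVar mv
```

With this sketch, myPutStr h (noisy h "foo" "bar") completes instead of 
deadlocking: the nested write grabs a second buffer and commits "foo" 
while the outer call is still filling its own buffer with "bar".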

For async IO on Unix, we use non-blocking read() calls, and if read() 
indicates that we need to block, we send a request to the IO Manager 
thread (see GHC.Conc) which calls select() on behalf of all the threads 
waiting for I/O.  For async IO on Windows, the threaded RTS uses its 
blocking foreign call mechanism to invoke read(), and the non-threaded 
RTS has a similar mechanism internally.
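
The Unix path above can be sketched roughly as follows.  This is an 
illustration, not the code in GHC's I/O library: readOrWait and readBytes 
are hypothetical names, the file descriptor is assumed to already be in 
O_NONBLOCK mode, and threadWaitRead (from Control.Concurrent) is used as 
the call that parks the Haskell thread until the descriptor is readable.

```haskell
import Control.Concurrent (threadWaitRead)
import Data.Word (Word8)
import Foreign.C.Error (eAGAIN, eWOULDBLOCK, getErrno, throwErrno)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (Ptr)
import System.Posix.Types (CSsize (..), Fd (..))

-- Direct binding to the C read() call.
foreign import ccall unsafe "unistd.h read"
  c_read :: CInt -> Ptr Word8 -> CSize -> IO CSsize

-- Try a read(); if it reports that it would block, wait until the
-- descriptor is readable and try again.  Assumes fd is non-blocking.
-- Returns the number of bytes read (0 means end of file).
readOrWait :: Fd -> Ptr Word8 -> Int -> IO Int
readOrWait fd@(Fd n) buf len = do
  r <- c_read n buf (fromIntegral len)
  if r >= 0
    then return (fromIntegral r)
    else do
      err <- getErrno
      if err == eAGAIN || err == eWOULDBLOCK
        then threadWaitRead fd >> readOrWait fd buf len
        else throwErrno "readOrWait"

-- Convenience wrapper: read up to len bytes into a temporary buffer
-- and return them as a list.
readBytes :: Fd -> Int -> IO [Word8]
readBytes fd len = allocaBytes len $ \buf -> do
  n <- readOrWait fd buf len
  peekArray n buf
```

Only the calling Haskell thread waits in threadWaitRead; other Haskell 
threads keep running in the meantime.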

We ought to be using the various alternatives to select(), but we 
haven't got around to that yet.

> moreover, i have an idea how to implement async i/o without complex
> bureaucracy: use mmapped files, maybe together with multiple buffers.

I don't think we should restrict the implementation to mmap'd files, for 
all the reasons that Einar gave.  Lots of things aren't mmapable, mainly.

My vision for an I/O library is this:

   - a single class supporting binary input (resp. output) that is
     implemented by various transports: files, sockets, mmap'd files,
     memory and arrays.  Windowed mmap is an option here too.

   - layers of binary filters on top of this: you could add buffering,
     and compression/decompression.

   - a layer of text translation at the top.

This is more or less how the Stream-based I/O library that I was working 
on is structured.

The binary I/O library would talk to a binary transport, perhaps with a 
layer of buffering, whereas text-based applications talk to the text layer.
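
To make the layering concrete, here is a small sketch of the input side.  
It is not the API of the Stream-based library mentioned above: ByteSource, 
MemSource, Buffered, Latin1 and getChars are hypothetical names, memory is 
the only transport shown, and the text layer assumes a Latin-1 encoding to 
keep the decoding trivial.

```haskell
import Data.Char (chr)
import Data.IORef
import Data.Word (Word8)

-- Bottom layer: a single class for binary input.
class ByteSource s where
  -- Read up to n bytes; an empty list means end of input.
  readBytes :: s -> Int -> IO [Word8]

-- A transport: bytes held in memory.
newtype MemSource = MemSource (IORef [Word8])

newMemSource :: [Word8] -> IO MemSource
newMemSource = fmap MemSource . newIORef

instance ByteSource MemSource where
  readBytes (MemSource ref) n = atomicModifyIORef' ref $ \bs ->
    let (now, later) = splitAt n bs in (later, now)

-- A binary filter: buffering over any ByteSource.  It is itself a
-- ByteSource, so filters can be stacked.
data Buffered s = Buffered s Int (IORef [Word8])

newBuffered :: s -> Int -> IO (Buffered s)
newBuffered src chunk = Buffered src chunk <$> newIORef []

instance ByteSource s => ByteSource (Buffered s) where
  readBytes (Buffered src chunk ref) n = do
    buf  <- readIORef ref
    -- Refill from the underlying source only when the buffer is empty.
    buf' <- if null buf then readBytes src (max chunk n) else return buf
    let (now, later) = splitAt n buf'
    writeIORef ref later
    return now

-- Top layer: text translation (Latin-1 assumed for this sketch).
newtype Latin1 s = Latin1 s

getChars :: ByteSource s => Latin1 s -> Int -> IO String
getChars (Latin1 src) n = map (chr . fromIntegral) <$> readBytes src n

-- Text over buffering over a memory transport.
demo :: IO String
demo = do
  mem <- newMemSource (map (fromIntegral . fromEnum) "hello, world")
  buf <- newBuffered mem 8
  getChars (Latin1 buf) 5
```

A binary client would call readBytes on the Buffered value (or on the 
transport directly), while a text client goes through getChars.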

Cheers,
	Simon


More information about the Glasgow-haskell-users mailing list