new i/o library
Simon Marlow
simonmarhaskell at gmail.com
Fri Jan 27 11:25:44 EST 2006
Bulat Ziganshin wrote:
> i'm now writing some sort of new i/o library. one area where i currently
> lack in comparison to the existing Handle implementation in GHC is
> asynchronous i/o operations. can you please briefly describe how
> this is done in GHC and, in particular, why multiple buffers are used?
Multiple buffers were introduced to cope with the semantics we wanted
for hPutStr. The problem is that you don't want hPutStr to hold a lock
on the Handle while it evaluates its argument list, because that could
take arbitrary time. Furthermore, things like this:
    putStr (trace "foo" "bar")
used to cause deadlocks: putStr held the lock on stdout while evaluating
its argument list, which caused trace to attempt to acquire the same
lock.
So, putStr first grabs a buffer from the Handle, then unlocks the Handle
while it fills up the buffer, then it takes the lock again to write the
buffer. Since another thread might try to putStr while the lock is
released, we need multiple buffers.
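The grab/fill/commit sequence above can be sketched roughly as follows. This is only an illustration, not GHC's actual Handle code: SketchHandle, sketchPutStr, contents and noisy are hypothetical names, a buffer is modelled as an IORef String, and the "device" is just a String of everything committed so far.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar, modifyMVar_, newMVar, readMVar)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical stand-ins: a buffer is a mutable String (kept reversed while
-- filling), and the "device" is simply the text committed so far.
type Buffer = IORef String

data HandleState = HandleState
  { spares  :: [Buffer]  -- free buffers owned by the handle
  , written :: String    -- everything committed so far
  }

newtype SketchHandle = SketchHandle (MVar HandleState)

newSketchHandle :: IO SketchHandle
newSketchHandle = SketchHandle <$> newMVar (HandleState [] "")

sketchPutStr :: SketchHandle -> String -> IO ()
sketchPutStr (SketchHandle lock) str = do
  -- 1. Hold the lock only long enough to grab a spare buffer,
  --    allocating a fresh one if another writer has taken them all.
  buf <- modifyMVar lock $ \st -> case spares st of
    (b : bs) -> pure (st { spares = bs }, b)
    []       -> do b <- newIORef ""; pure (st, b)
  -- 2. The lock is released here. Forcing the argument may take arbitrary
  --    time, and may itself write to the same handle.
  mapM_ (\c -> c `seq` modifyIORef' buf (c :)) str
  -- 3. Retake the lock to commit the filled buffer and return it to the pool.
  modifyMVar_ lock $ \st -> do
    chunk <- readIORef buf
    writeIORef buf ""
    pure (st { written = written st ++ reverse chunk, spares = buf : spares st })

contents :: SketchHandle -> IO String
contents (SketchHandle lock) = written <$> readMVar lock

-- A trace-like helper (hypothetical): evaluating the result writes msg to
-- the handle as a side effect.
noisy :: SketchHandle -> String -> String -> String
noisy h msg result = unsafePerformIO (sketchPutStr h msg) `seq` result
```

With this arrangement, sketchPutStr h (noisy h "foo\n" "bar") does not deadlock: the inner write runs during step 2, when the lock is free, and simply allocates a second buffer.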
For async IO on Unix, we use non-blocking read() calls, and if read()
indicates that we need to block, we send a request to the IO Manager
thread (see GHC.Conc) which calls select() on behalf of all the threads
waiting for I/O. For async IO on Windows, the threaded RTS uses its
blocking foreign call mechanism to invoke read(), and the non-threaded
RTS has a similar mechanism internally.
We ought to be using the various alternatives to select(), but we
haven't got around to that yet.
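The shape of the Unix path (try a non-blocking read, and wait only if it would block) can be sketched as below. This is a simplified model rather than the real implementation: ReadResult, readWhenReady and simulated are hypothetical names, and the descriptor is simulated with IORefs so the example stays self-contained. In GHC the waiting step would be something like threadWaitRead, which parks the calling thread until the IO manager reports the descriptor readable.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Outcome of one non-blocking read attempt (hypothetical type).
data ReadResult = WouldBlock | Got String

-- Try the read; if it would block, wait for readiness and try again.
-- Only the calling Haskell thread waits, not the whole program.
readWhenReady :: IO ReadResult -> IO () -> IO String
readWhenReady tryRead waitReadable = loop
  where
    loop = do
      r <- tryRead
      case r of
        Got s      -> pure s
        WouldBlock -> waitReadable >> loop

-- A simulated descriptor that becomes readable after `delay` waits.
-- Returns the read action, the wait action, and a counter of waits performed.
simulated :: Int -> String -> IO (IO ReadResult, IO (), IORef Int)
simulated delay payload = do
  remaining <- newIORef delay
  waits <- newIORef 0
  let tryRead = do
        n <- readIORef remaining
        pure (if n > 0 then WouldBlock else Got payload)
      waitReadable = do
        modifyIORef' remaining (subtract 1)
        modifyIORef' waits (+ 1)
  pure (tryRead, waitReadable, waits)
```

The point of routing every wait through one manager thread is that a single select() call can cover all blocked threads, instead of one OS-level wait per thread.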
> moreover, i have an idea how to implement async i/o without complex
> bureaucracy: use mmapped files, maybe together with multiple buffers.
I don't think we should restrict the implementation to mmap'd files, for
all the reasons that Einar gave. Lots of things aren't mmapable, mainly.
My vision for an I/O library is this:
  - a single class supporting binary input (resp. output) that is
    implemented by various transports: files, sockets, mmap'd files,
    memory and arrays. Windowed mmap is an option here too.
  - layers of binary filters on top of this: you could add buffering,
    and compression/decompression.
  - a layer of text translation at the top.
This is more or less how the Stream-based I/O library that I was working
on is structured.
The binary I/O library would talk to a binary transport, perhaps with a
layer of buffering, whereas text-based applications talk to the text layer.
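To make the layering concrete, here is a minimal sketch of the input side only. It is not the API of the Stream-based library mentioned above: ByteSource, MemSource, Buffered and readAllText are hypothetical names, bytes are held in plain lists for brevity, and the text layer assumes a one-byte-per-character (Latin-1 style) encoding.

```haskell
import Data.Char (chr, ord)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Bottom layer: one class for binary input, implemented by each transport.
-- readBytes returns at most n bytes, and [] at end of input.
class ByteSource s where
  readBytes :: s -> Int -> IO [Word8]

-- An in-memory transport. Files, sockets or mmap'd regions would be
-- further instances of the same class.
newtype MemSource = MemSource (IORef [Word8])

-- Assumes the String contains only characters below 256.
memSource :: String -> IO MemSource
memSource str = MemSource <$> newIORef (map (fromIntegral . ord) str)

instance ByteSource MemSource where
  readBytes (MemSource ref) n =
    atomicModifyIORef' ref (\bytes -> let (out, rest) = splitAt n bytes in (rest, out))

-- Middle layer: a buffering filter that wraps any source and is itself a
-- source, so filters can be stacked.
data Buffered s = Buffered s Int (IORef [Word8])

buffered :: s -> Int -> IO (Buffered s)
buffered src chunk = Buffered src chunk <$> newIORef []

instance ByteSource s => ByteSource (Buffered s) where
  readBytes (Buffered src chunk ref) n = do
    pending <- readIORef ref
    -- Refill from the underlying source only when the buffer is empty.
    avail <- if null pending then readBytes src chunk else pure pending
    let (out, rest) = splitAt n avail
    writeIORef ref rest
    pure out

-- Top layer: text translation, here a trivial byte-to-Char decoding.
readAllText :: ByteSource s => s -> IO String
readAllText src = go
  where
    go = do
      bytes <- readBytes src 16
      if null bytes
        then pure []
        else (map (chr . fromIntegral) bytes ++) <$> go
```

A binary application would call readBytes on the (possibly buffered) transport directly, while a text application would go through readAllText or a similar decoding layer. A compression filter would be one more wrapper with the same shape as Buffered.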
Cheers,
Simon
More information about the Glasgow-haskell-users mailing list