new i/o library
Bulat Ziganshin
bulatz at HotPOP.com
Sat Jan 28 11:34:05 EST 2006
Hello Simon,
Friday, January 27, 2006, 7:25:44 PM, you wrote:
>> i'm now writing some sort of new i/o library. one area where i currently
>> lag behind the existing Handles implementation in GHC is
>> asynchronous i/o operations. can you please briefly describe how
>> this is done in GHC and, in particular, why multiple buffers are used?
SM> Multiple buffers were introduced to cope with the semantics we wanted
SM> for hPutStr.
thank you. i had read the hPutStr comments, but didn't understand that this
problem is the only reason for introducing multiple buffers
SM> The problem is that you don't want hPutStr to hold a lock
SM> on the Handle while it evaluates its argument list, because that could
SM> take arbitrary time. Furthermore, things like this:
SM> putStr (trace "foo" "bar")
SM> used to cause deadlocks, because putStr holds the lock, evaluates its
SM> argument list, which causes trace to also attempt to acquire the lock on
SM> stdout, leading to deadlock.
SM> So, putStr first grabs a buffer from the Handle, then unlocks the Handle
SM> while it fills up the buffer, then it takes the lock again to write the
SM> buffer. Since another thread might try to putStr while the lock is
SM> released, we need multiple buffers.
i don't understand the last sentence. you were talking about problems with
performing I/O inside the computation of putStr's argument, not about
another thread?
i understand that locks are basically needed because multiple threads can
try to do i/o with the same Handle simultaneously
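
to check my reading of the scheme above, here is a minimal sketch of it. it is not the GHC code: SketchHandle, sketchPutStr and demoNested are hypothetical names, the "device" is just an IORef, and the lock is an MVar holding the list of spare buffers. demoNested imitates putStr (trace "foo" "bar") by doing output on the same handle while the argument is evaluated.

```haskell
import Control.Concurrent.MVar
import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical, simplified handle (not GHC's Handle type).
-- Taking the MVar plays the role of locking the Handle.
data SketchHandle = SketchHandle
  { spareBuffers :: MVar [IORef String]  -- buffers not currently in use
  , device       :: IORef String         -- stands in for the file or terminal
  }

newSketchHandle :: IO SketchHandle
newSketchHandle = do
  spares <- newMVar []
  dev    <- newIORef ""
  return (SketchHandle spares dev)

sketchPutStr :: SketchHandle -> String -> IO ()
sketchPutStr h str = do
  -- 1. Lock briefly: grab a spare buffer, or allocate one if none is free.
  buf <- modifyMVar (spareBuffers h) $ \bufs ->
    case bufs of
      (b:rest) -> return (rest, b)
      []       -> do
        b <- newIORef ""
        return ([], b)
  -- 2. Lock released: force the spine of the argument and fill the buffer.
  --    This may take arbitrary time, or perform output on the same handle.
  length str `seq` writeIORef buf str
  -- 3. Lock again: write the buffer out and return it to the spare list.
  modifyMVar_ (spareBuffers h) $ \bufs -> do
    contents <- readIORef buf
    modifyIORef (device h) (++ contents)
    writeIORef buf ""
    return (buf : bufs)

-- Imitates putStr (trace "foo" "bar"): evaluating the argument
-- writes to the same handle before the outer call writes anything.
demoNested :: IO String
demoNested = do
  h <- newSketchHandle
  sketchPutStr h (unsafePerformIO (sketchPutStr h "foo" >> return "bar"))
  readIORef (device h)
```

in this sketch the nested call does not deadlock, because the lock is not held during step 2. it also has to allocate a second buffer, since the outer call still owns the first one at that moment.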
SM> For async IO on Unix, we use non-blocking read() calls, and if read()
SM> indicates that we need to block, we send a request to the IO Manager
SM> thread (see GHC.Conc) which calls select() on behalf of all the threads
SM> waiting for I/O. For async IO on Windows, we either use the threaded
SM> RTS's blocking foreign call mechanism to invoke read(), or the
SM> non-threaded RTS has a similar mechanism internally.
so, async I/O in GHC has nothing in common with "zero-wait
operation" in a single-threaded environment and can only help to overlap
i/o in one thread with the execution of other threads?
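
if that reading is right, the effect from the program's side looks roughly like the sketch below. it does not touch the IO Manager or select() at all: slowRead is a hypothetical stand-in that uses threadDelay in place of a read() that would block, and overlapped is an assumed name. the only point is that one Haskell thread waits while another keeps computing.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar

-- Hypothetical stand-in for a read() that would block.
-- The delay plays the role of waiting for the device.
slowRead :: IO String
slowRead = do
  threadDelay 100000   -- 0.1 s; only this Haskell thread is suspended
  return "data"

-- Start the read in its own thread, keep computing in the caller,
-- then collect the result through an MVar.
overlapped :: IO (String, Integer)
overlapped = do
  result <- newEmptyMVar
  _ <- forkIO (slowRead >>= putMVar result)
  let total = sum [1 .. 100000 :: Integer]  -- work done while the read waits
  total `seq` return ()
  input <- takeMVar result                  -- blocks only if the read is not done yet
  return (input, total)
```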
SM> We ought to be using the various alternatives to select(), but we
SM> haven't got around to that yet.
yes, i read these threads and even remember a Trac ticket about this.
btw, in the typeclass-based i/o library this facility can be added
as an additional middle layer, in the same way as buffering and Char
encoding. i even think that it can be done as a third-party library, w/o any
changes to the main library itself
>> moreover, i have an idea how to implement async i/o without complex
>> bureaucracy: use mmapped files, maybe together with multiple buffers.
SM> I don't think we should restrict the implementation to mmap'd files, for
SM> all the reasons that Einar gave. Lots of things aren't mmapable, mainly.
i'm interested because mmap can be used to speed up i/o-bound
programs. but it seems that memory-mapped (m/m) files can't be used to
overlap i/o in multi-threaded applications. anyway, i use a class-based
design so at least we can provide m/m files as one of the Stream instances
SM> My vision for an I/O library is this:
SM> - a single class supporting binary input (resp. output) that is
SM> implemented by various transports: files, sockets, mmap'd files,
SM> memory and arrays. Windowed mmap is an option here too.
i don't consider fully-mapped files as a separate instance, because
they can be simulated by using window-mapped files with a large window
SM> - layers of binary filters on top of this: you could add buffering,
SM> and compression/decompression.
SM> - a layer of text translation at the top.
SM> This is more or less how the Stream-based I/O library that I was working
SM> on is structured.
SM> The binary I/O library would talk to a binary transport, perhaps with a
SM> layer of buffering, whereas text-based applications talk to the text layer.
that's more or less close to what i do. it is no wonder - i was
substantially influenced by the design of your "new i/o" library. the
only difference is that i use one Stream class for all streams
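
to make the layering concrete, here is a minimal sketch of the idea, not the actual library code. all names (Stream, sGetByte, sPutByte, MemStream, Counted, sPutStr, sGetContents, demoLayers) are hypothetical. a single class is implemented both by a transport and by a middle layer wrapped around any other stream, and the text functions on top assume Latin-1 only.

```haskell
import Data.Char (chr, ord)
import Data.IORef
import Data.Word (Word8)

-- Hypothetical class: one interface for every kind of stream.
class Stream s where
  sGetByte :: s -> IO (Maybe Word8)   -- Nothing signals end of stream
  sPutByte :: s -> Word8 -> IO ()

-- Transport: an in-memory stream, read in the order it was written.
newtype MemStream = MemStream (IORef [Word8])

newMemStream :: IO MemStream
newMemStream = MemStream <$> newIORef []

instance Stream MemStream where
  sGetByte (MemStream ref) = do
    bytes <- readIORef ref
    case bytes of
      []     -> return Nothing
      (b:bs) -> writeIORef ref bs >> return (Just b)
  sPutByte (MemStream ref) b = modifyIORef ref (++ [b])

-- Middle layer: counts bytes written and passes them on unchanged.
-- It stands in for buffering or compression and wraps any Stream.
data Counted s = Counted (IORef Int) s

instance Stream s => Stream (Counted s) where
  sGetByte (Counted _ inner)   = sGetByte inner
  sPutByte (Counted n inner) b = modifyIORef n (+ 1) >> sPutByte inner b

-- Text layer (assumption: Latin-1, one byte per Char).
sPutStr :: Stream s => s -> String -> IO ()
sPutStr s = mapM_ (sPutByte s . fromIntegral . ord)

sGetContents :: Stream s => s -> IO String
sGetContents s = do
  mb <- sGetByte s
  case mb of
    Nothing -> return ""
    Just b  -> (chr (fromIntegral b) :) <$> sGetContents s

-- Usage: write text through the counting layer into memory, read it back.
demoLayers :: IO (String, Int)
demoLayers = do
  mem <- newMemStream
  n   <- newIORef 0
  let layered = Counted n mem
  sPutStr layered "hello"
  text  <- sGetContents layered
  count <- readIORef n
  return (text, count)
```

because Counted is itself a Stream instance, such a layer can live outside the main library, and a memory-mapped transport would be just another instance next to MemStream.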
--
Best regards,
Bulat mailto:bulatz at HotPOP.com