[Haskell-cafe] Re[2]: standard poll/select interface

Bulat Ziganshin bulatz at HotPOP.com
Fri Feb 10 09:51:55 EST 2006


Hello Simon,

Friday, February 10, 2006, 3:26:30 PM, you wrote:

>>>>as i understand this idea, transformer implementing async i/o should
>>>>intercept vGetBuf/vPutBuf calls for the FDs, start the appropriate
>>>>
>>>>type FD = Int
>>>>vGetBuf_async :: FD -> Ptr a -> Int -> IO Int
>>>>vPutBuf_async :: FD -> Ptr a -> Int -> IO Int
>> 
>> 
>> EK> Please don't fix FD = Int, this is not true on some systems,
>> EK> and when implementing efficient sockets one usually wants
>> EK> to hold more complex state.
>> 
>> the heart of the library is class Stream. both "File" and "Socket"
>> should implement this interface. just now i use plain "FD" to
>> represent files, but that is temporary solution - really file also
>> must carry additional information: filename, open mode, open/closed
>> state. This "File" will be an abstract datatype, what can be based not
>> on FD in other operating systems.
>> 
>> The same applies to the "Socket". it can be any type what carry enough
>> information to work with network i/o.
>> 
>> implementation of async i/o should have a form of Stream Transformer,
>> which intercepts only the vGetBuf/vPutBuf operations and pass other
>> operations as is:

SM> I don't think async I/O is a stream transformer, fitting it into the 
SM> stream hierarchy seems artificial to me.

yes, that is possible - i may be trying to implement everything as a
transformer, regardless of real necessity. i really think that the
idea of transformers fits every need in extending functionality

here is a list of my reasons to implement this as a transformer:

1) there is no "common FD" interface. module System.FD implements
something, but it is really an interface only for file i/o. it is
partially used in System.MMFile, which implements memory-mapped files,
and i think these fd* operations will be used to partially implement
Socket operations, but some things will differ, including the use of
recv/send instead of read/write to implement the GetBuf/PutBuf
operations. so, there is no common "instance Stream FD", but different
instances for files, memory-mapped files and sockets. As Einar just
mentioned, the Socket datatype will include information that is absent
from the File datatype. So, these 3 types have in common the use of an
FD to implement some of their operations, but other operations will
differ and the internal datatype structures will differ too. A
transformer is an ideal way to reimplement just the vGetBuf/vPutBuf
operations while passing through all the rest. Without it, instead of
3 methods of doing I/O (mmap/read/recv) you will need to implement all
5 (mmap/read/recv/readAsync/recvAsync) - and that is without counting
select, epoll and kqueue separately

2) as you can see in the epoll()-based implementation of async i/o in
the alt-network library, Einar attaches additional data (read/write
queues) to the FD to support the epoll() interface. This data will be
different for select, epoll, kqueue and other methods of async i/o,
and without async i/o none of it is needed at all. A transformer is an
ideal way to attach additional data to the file/socket without
changing the "raw" datatype. Again, otherwise you will need to attach
all this data to the raw file, duplicate this work for the raw socket
and then repeat it for select, epoll and the other async i/o methods
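
The two points above can be sketched roughly as follows. This is only
an illustration under assumptions: the Stream class is cut down to
three operations, and Async, withAsync, asyncCalls, NullFile and
demoAsync are hypothetical names, not part of any existing library.
The counter merely stands in for whatever per-method data (read/write
queues for epoll, and so on) a real transformer would carry.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Data.Word (Word8)
import Foreign.Ptr (Ptr, nullPtr)

-- Hypothetical, cut-down Stream class (the real one has more operations).
class Stream h where
  vGetBuf :: h -> Ptr Word8 -> Int -> IO Int
  vPutBuf :: h -> Ptr Word8 -> Int -> IO Int
  vClose  :: h -> IO ()

-- The transformer: wraps any stream and carries its own extra state,
-- so the "raw" datatype does not have to change.
data Async h = Async
  { asyncBase  :: h
  , asyncCalls :: IORef Int  -- stand-in for select/epoll/kqueue data
  }

withAsync :: h -> IO (Async h)
withAsync h = Async h <$> newIORef 0

instance Stream h => Stream (Async h) where
  -- Only the buffer operations are intercepted. A real implementation
  -- would register with an I/O manager here instead of counting calls.
  vGetBuf (Async h calls) buf n = modifyIORef' calls (+ 1) >> vGetBuf h buf n
  vPutBuf (Async h calls) buf n = modifyIORef' calls (+ 1) >> vPutBuf h buf n
  -- Everything else is passed through as is.
  vClose (Async h _) = vClose h

-- A dummy raw stream, used only to exercise the transformer.
data NullFile = NullFile

instance Stream NullFile where
  vGetBuf _ _ _ = pure 0  -- always at end of file
  vPutBuf _ _ n = pure n  -- pretends everything was written
  vClose _      = pure ()

demoAsync :: IO (Int, Int)
demoAsync = do
  s <- withAsync NullFile
  _ <- vGetBuf s nullPtr 16
  written <- vPutBuf s nullPtr 16
  calls <- readIORef (asyncCalls s)
  vClose s
  pure (written, calls)
```

The same Async wrapper applies unchanged to a file, a memory-mapped
file or a socket instance, which is the duplication the transformer
approach is meant to avoid.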

on the other side, the reasons for your proposal, as i see them:

1) if FD incorporates async i/o support, the System.FD library will
become much more useful - anyone using the low-level fd* functions
will get async i/o support for free

but there is another deficiency in the System.FD library - it doesn't
include support for files >4Gb or for files with unicode filenames
under Windows. it seems natural to include this support in fd* too.

now let's see. you are proposing to include in the fd* implementation
support for files, sockets, various async i/o methods and whatnot.
don't you think that this library will become a successor of the
Handle library, implementing all possible functionality and not
giving 3rd-party libraries a chance to change anything partially?

i propose instead to divide the library into small manageable pieces
that can be easily studied/modified/replaced and that bring something
really useful only when used together. if that means that the
low-level fd* interface can't be used even to work with raw files
without great restrictions (no Unicode filenames in windows, no async
i/o) then so be it.


SM> It is just another way of doing I/O directly to/from file descriptors. 
SM> If your basic operation to read from an FD is

SM>    readFD :: FD -> Int -> Ptr Word8 -> IO Int

SM> then an async I/O layer simply provides you with the exact same 
SM> interface, but with an implementation that doesn't block other threads. 
SM>   It is part of the file descriptor interface, not a stream transformer. 
SM>   Also, you probably need

SM>    readNonBlockingFD :: FD -> Int -> Ptr Word8 -> IO Int
SM>    isReadyFD :: FD -> IO Bool

SM> in fact, I think this should be the basic API, since you can implement 
SM> readFD in terms of it.  (readNonBlockingFD always reads at least one 
SM> byte, blocking until some data is available).  This is used to partially 
SM> fill an input buffer with the available data, for example.

this can be in the basic API, but not in the basic implementation :)))
really, i think that you are mixing two things - a readNonBlockingFD
call that may fill the buffer only partially, and a readAsync call
that uses some I/O manager to run other Haskell threads while the data
is being read

well, i agree that there should be two GetBuf variants in the Stream
interface - greedy and non-greedy. say, vGetBuf and
vGetBufNonBlocking. does vPutBuf also need two variants?
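
The relation between the two variants, as Simon describes it (the
greedy read built on top of the partial one), might look roughly like
this. It is a sketch only: PartialRead, readFully, chunkedSource and
demoReadFully are hypothetical names, and the partial read is passed
in as a parameter rather than being a real readNonBlockingFD, so that
the example does not depend on any actual FD implementation.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray, pokeArray)
import Foreign.Ptr (Ptr, plusPtr)

-- Shape of a non-greedy read: byte count, buffer -> bytes actually read.
-- It may return fewer bytes than requested; 0 means end of input.
type PartialRead = Int -> Ptr Word8 -> IO Int

-- Greedy read: keep calling the partial read until the requested
-- number of bytes has arrived or the input is exhausted.
readFully :: PartialRead -> Int -> Ptr Word8 -> IO Int
readFully readSome total buf = go 0
  where
    go done
      | done >= total = pure done
      | otherwise = do
          n <- readSome (total - done) (buf `plusPtr` done)
          if n <= 0 then pure done else go (done + n)

-- A fake source that hands out at most 3 bytes per call, imitating a
-- descriptor on which only part of the data is available at a time.
chunkedSource :: IORef [Word8] -> PartialRead
chunkedSource ref n buf = do
  xs <- readIORef ref
  let chunk = take (min 3 n) xs
  writeIORef ref (drop (length chunk) xs)
  pokeArray buf chunk
  pure (length chunk)

demoReadFully :: IO (Int, [Word8])
demoReadFully = do
  ref <- newIORef [1 .. 10]
  allocaBytes 8 $ \buf -> do
    n <- readFully (chunkedSource ref) 8 buf
    bytes <- peekArray n buf
    pure (n, bytes)
```

Here the fake source needs three calls (3 + 3 + 2 bytes) to satisfy an
8-byte request, while a caller that only wants "whatever is available"
would use the partial read directly.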

then, maybe LineBuffering and BlockBuffering should use
vGetBufNonBlocking and vGetBuf, respectively?

but i don't know anything about the implementation. is the only
difference between the readNonBlockingFD and readFD calls the
O_NONBLOCK mode of the file handle, or are different functions used?
what about Windows? what about sockets? how does this interact with
async i/o?

SM> One problem here is that in order to implement readNonBlockingFD on Unix 
SM> you have to put the FD into O_NONBLOCK mode, which due to misdesign of 
SM> the Unix API affects other users of the same file descriptor, including 
SM> other programs.  GHC suffers from this problem.

does that mean that it is better to decide at the "open" stage whether
this file will be used with readNonBlockingFD or with plain readFD?


-- 
Best regards,
 Bulat                            mailto:bulatz at HotPOP.com




