Proposal for a new I/O library design
Simon Marlow
libraries@haskell.org
Mon, 28 Jul 2003 12:06:46 +0100
On the whole, I think this is a good direction to explore. I like the
separation of Files from Streams: indeed it would remove much of the
complication in the existing system caused by having Handles which can
be both read and written. Also, it gives a nice way to integrate other
objects such as Sockets into the I/O system, which can also have streams
layered on top of them.
I'm concerned about one implementation difficulty. Your File type is
independent of the filesystem. That is, on Unix it corresponds to an
inode. Creating a File must correspond to "opening" it (in Unix speak).
Creating a stream corresponds to duplicating the file descriptor (you
could probably avoid too many unnecessary dups by being clever).
Specifically, lookupFileByPathname must open the file without knowing
whether it will later be used for reading or for writing. So I would
suggest that operations which create a value of type File take a
read/write flag too.
> > type FilePos = Word64
> > type BlockLength = Int
FilePos should be Integer.
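For comparison, the existing System.IO interface already uses Integer
for file sizes and offsets (hFileSize, hTell, hSeek). A minimal sketch;
the helper name sizeAndEndPos is made up for illustration:

```haskell
import System.IO

-- Hypothetical helper: return the size of a file and the offset reported
-- after seeking to its end. System.IO represents both as Integer.
sizeAndEndPos :: FilePath -> IO (Integer, Integer)
sizeAndEndPos path = withBinaryFile path ReadMode $ \h -> do
  size <- hFileSize h        -- hFileSize :: Handle -> IO Integer
  hSeek h SeekFromEnd 0      -- the offset argument is an Integer
  pos <- hTell h             -- hTell :: Handle -> IO Integer
  return (size, pos)
```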
> > fCheckRead :: File -> FilePos -> BlockLength -> IO Bool
> > fCheckWrite :: File -> FilePos -> BlockLength -> IO Bool
What do these do? If they're supposed to return True if the required
data can be read/written without blocking, then I suspect that they are
not useful.
> Fundamental operations on streams. "Maybe Octet" is supposed to represent
> "Octet or EOS," though I'm not sure this is enough for proper EOS
> handling.
I'd use the traditional 'isEOF' way of detecting end of file.
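For reference, this is roughly what the isEOF style looks like with
today's Handles. It is only a sketch against the existing System.IO
API, not the proposed stream types, and countChars is a made-up name:

```haskell
import System.IO

-- Hypothetical example: count the characters in a file by testing for
-- end of file before every read, instead of returning a Maybe per read.
countChars :: FilePath -> IO Int
countChars path = withBinaryFile path ReadMode (go 0)
  where
    go :: Int -> Handle -> IO Int
    go n h = do
      eof <- hIsEOF h
      if eof
        then return n
        else do
          _ <- hGetChar h
          let n' = n + 1
          n' `seq` go n' h   -- force the counter so thunks do not pile up
```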
On naming: it's probably not a good idea to use the 'is' prefix, since
it is already used for predicates (meaning literally 'is' rather than an
abbreviation for 'InputStream').
> > isGet :: InputStream -> IO (Maybe Octet)
> > isPeek :: InputStream -> IO (Maybe Octet)
> > isGetBlock :: InputStream -> BlockLength -> XXX -> IO BlockLength
> > -- efficiency hack
> >
> > osPut :: OutputStream -> Octet -> IO ()
> > osPuts :: OutputStream -> [Octet] -> IO ()
> > osPutBlock :: OutputStream -> BlockLength -> XXX -> IO ()
> > osFlush :: OutputStream -> IO ()
You need operations to control buffering, too. Something like
h{Set,Get}Buffering would be fine.
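As an illustration of what the pair gives you on Handles today, here is
a small sketch that changes the buffering mode for the duration of one
action and then restores it. withBuffering is a hypothetical helper,
and an equivalent for the proposed stream types is assumed, not given:

```haskell
import Control.Exception (bracket)
import System.IO

-- Hypothetical helper: run an action with a temporary buffering mode,
-- restoring the previous mode afterwards, even if the action throws.
withBuffering :: Handle -> BufferMode -> IO a -> IO a
withBuffering h mode action =
  bracket (hGetBuffering h)              -- remember the current mode
          (hSetBuffering h)              -- put it back at the end
          (\_ -> hSetBuffering h mode >> action)
```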
You will also want a way to get back from an InputStream to the
underlying object, eg. the (File,FilePos) pair if one exists.
It's not pretty, but you certainly want a way to close a stream.
Finalizers aren't reliable enough.
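With an explicit close, the usual way to make it reliable is to pair
open and close with bracket. A sketch using the existing Handle API
(withInputFile is a made-up name; a close operation on the proposed
streams is assumed to behave like hClose):

```haskell
import Control.Exception (bracket)
import System.IO

-- Hypothetical helper: open a file for reading, run the action, and
-- close the handle deterministically, whether or not the action throws.
withInputFile :: FilePath -> (Handle -> IO a) -> IO a
withInputFile path = bracket (openBinaryFile path ReadMode) hClose
```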
How did you intend text encodings to work? I see several possibilities:
textDecode :: TextEncoding -> [Octet] -> [Char]
or
decodeInputStream :: TextEncoding -> InputStream -> TextInputStream
getChar :: TextInputStream -> IO Char
etc.
or
setInputStreamCoding :: InputStream -> TextEncoding -> IO ()
getChar :: InputStream -> IO Char
The first one is nice, but hard to optimise, and it will get complicated
for encodings which have state. The second one is probably the best
compromise.
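To make the second option concrete, here is a rough sketch. Everything
in it is an assumption for illustration: the stream types are modelled
as records of IO actions, TextEncoding is modelled as a decoder
function, only Latin-1 is provided, and getTextChar returns Maybe Char
(Nothing at end of stream) rather than the IO Char written above. A
stateful encoding would keep its state inside the decoder's closure.

```haskell
import Data.Char (chr)
import Data.IORef
import Data.Word (Word8)

type Octet = Word8

-- Hypothetical stand-ins for the proposal's abstract stream types.
newtype InputStream     = InputStream     { getOctet    :: IO (Maybe Octet) }
newtype TextInputStream = TextInputStream { getTextChar :: IO (Maybe Char) }

-- Assumed representation: an encoding turns an octet source into a
-- character source.
newtype TextEncoding = TextEncoding
  { decodeWith :: IO (Maybe Octet) -> IO (Maybe Char) }

-- Latin-1 is stateless: every octet maps to the code point of the same value.
latin1 :: TextEncoding
latin1 = TextEncoding (fmap (fmap (chr . fromIntegral)))

decodeInputStream :: TextEncoding -> InputStream -> TextInputStream
decodeInputStream enc s = TextInputStream (decodeWith enc (getOctet s))

-- In-memory octet stream, only for trying the sketch out.
fromOctets :: [Octet] -> IO InputStream
fromOctets os = do
  ref <- newIORef os
  let next = atomicModifyIORef' ref step
      step []       = ([], Nothing)
      step (y : ys) = (ys, Just y)
  return (InputStream next)
```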
> > data Directory -- abstract
I don't see a reason for changing the existing Directory support
(System.Directory). Could you give some motivation here? Is the idea
to abstract away from the syntax of pathnames on the platform (eg.
directory separator characters)? If so, I'm not sure it's worthwhile.
There are lots of differences between pathname conventions: case
sensitivity, arbitrary limits on the length of filenames, filename
extensions, and so on.
> Convenient shortcuts for common cases.
>
> > lookupFileByPathname :: String -> IO File
Here, I suggest we need
lookupFileByPathname :: FilePath -> IOMode -> IO File
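A minimal sketch of that signature, assuming (purely for illustration)
that File is a thin wrapper around an existing Handle. The File
representation, fileHandle and closeFile are all hypothetical:

```haskell
import System.IO

-- Hypothetical stand-in for the proposal's abstract File type.
newtype File = File { fileHandle :: Handle }

-- The mode is fixed when the file is opened, so the implementation
-- never has to guess whether reads or writes will follow.
lookupFileByPathname :: FilePath -> IOMode -> IO File
lookupFileByPathname path mode = fmap File (openBinaryFile path mode)

-- Explicit close for the wrapped handle.
closeFile :: File -> IO ()
closeFile = hClose . fileHandle
```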
> > lookupInputStreamByPathname :: String -> IO InputStream
> > -- at least as likely to succeed as lookupFileByPathname
and similarly
createFileOutputStream :: FilePath -> IO OutputStream
appendFile :: FilePath -> IO OutputStream
Cheers,
Simon