Proposal for a new I/O library design

Mon, 28 Jul 2003 12:06:46 +0100

[ replies to libraries@haskell.org ]

On the whole, I think this is a good direction to explore.  I like the
separation of Files from Streams: indeed it would remove much of the
complication in the existing system caused by having Handles which can
be both read and written.  Also, it gives a nice way to integrate other
objects such as Sockets into the I/O system, which can also have streams
layered on top of them.

Your File type is independent of the filesystem: on Unix it corresponds
to an inode.  Creating a File must therefore correspond to "opening" it
(in Unix speak), and creating a stream corresponds to duplicating the
file descriptor (you could probably avoid too many unnecessary dups by
being clever).  This leads to one implementation difficulty:
lookupFileByPathname must open the file without knowing whether it
will later be used for reading or writing.  So I would suggest that
operations which create a value of type File take a read/write flag
too.

> > type FilePos = Word64
> > type BlockLength = Int

FilePos should be Integer.

> > fCheckRead  :: File -> FilePos -> BlockLength -> IO Bool
> > fCheckWrite :: File -> FilePos -> BlockLength -> IO Bool

What do these do?  If they're supposed to return True if the required
data can be read/written without blocking, then I suspect that they are
not useful.

> Fundamental operations on streams. "Maybe Octet" is supposed to
> represent "Octet or EOS," though I'm not sure this is enough for
> proper EOS handling.

I'd use the traditional 'isEOF' way of detecting end of file.
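
To illustrate what I mean, here is a minimal sketch using the existing
System.IO Handle API (hIsEOF, hGetChar) rather than the proposed stream
types.  The names drain and drainFile are hypothetical, chosen only for
this example.  The read operation returns a plain value, and end of
file is tested separately before each read.

```haskell
import System.IO (Handle, IOMode (..), hGetChar, hIsEOF, withFile)

-- Read every remaining character from a handle, testing for end of
-- file before each read instead of returning a Maybe from the read.
drain :: Handle -> IO String
drain h = go []
  where
    go acc = do
      eof <- hIsEOF h
      if eof
        then return (reverse acc)
        else do
          c <- hGetChar h
          go (c : acc)

-- Usage: read a whole file.  drain builds the complete result before
-- returning, so the handle is no longer needed when withFile closes it.
drainFile :: FilePath -> IO String
drainFile path = withFile path ReadMode drain
```
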

On naming: it's probably not a good idea to use the 'is' prefix, since
it is already used for predicates (meaning literally 'is' rather than an
abbreviation for 'InputStream').

> > isGet      :: InputStream -> IO (Maybe Octet)
> > isPeek     :: InputStream -> IO (Maybe Octet)
> > isGetBlock :: InputStream -> BlockLength -> XXX -> IO BlockLength
> >	-- efficiency hack
> >
> > osPut      :: OutputStream -> Octet -> IO ()
> > osPuts     :: OutputStream -> [Octet] -> IO ()
> > osPutBlock :: OutputStream -> BlockLength -> XXX -> IO ()
> > osFlush    :: OutputStream -> IO ()

You need operations to control buffering, too.  Something like
h{Set,Get}Buffering would be fine.
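
For reference, this is roughly how the existing Handle operations are
used; I would expect the stream versions to look much the same.  The
helper withBuffering and the function demo are hypothetical names for
this sketch only.

```haskell
import Control.Exception (bracket)
import System.IO
  ( BufferMode (..), Handle, IOMode (..)
  , hGetBuffering, hPutStr, hSetBuffering, withFile )

-- Run an action with the handle temporarily in the given buffering
-- mode, restoring the previous mode afterwards (even on exceptions).
withBuffering :: Handle -> BufferMode -> IO a -> IO a
withBuffering h mode action =
  bracket
    (do old <- hGetBuffering h
        hSetBuffering h mode
        return old)
    (hSetBuffering h)
    (const action)

-- Usage: write one unbuffered chunk, then report the mode seen inside
-- the action and whether the original mode was restored afterwards.
demo :: FilePath -> IO (BufferMode, Bool)
demo path = withFile path WriteMode $ \h -> do
  before <- hGetBuffering h
  inside <- withBuffering h NoBuffering (hPutStr h "x" >> hGetBuffering h)
  after <- hGetBuffering h
  return (inside, before == after)
```
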

You will also want a way to get back from an InputStream to the
underlying object, eg. the (File,FilePos) pair if one exists.

It's not pretty, but you certainly want a way to close a stream.
Finalizers aren't reliable enough.
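
With an explicit close, the usual bracket idiom gives deterministic
cleanup.  The sketch below uses today's Handles as a stand-in for
streams; withInputFile and firstLineAndClosed are hypothetical names.

```haskell
import Control.Exception (bracket)
import System.IO (Handle, IOMode (..), hClose, hGetLine, hIsClosed, openFile)

-- Open a file, run the action, and always close the handle afterwards,
-- whether the action returns normally or throws.
withInputFile :: FilePath -> (Handle -> IO a) -> IO a
withInputFile path = bracket (openFile path ReadMode) hClose

-- Usage: read the first line, and deliberately let the handle escape
-- so that we can observe that it was closed when the action finished.
firstLineAndClosed :: FilePath -> IO (String, Bool)
firstLineAndClosed path = do
  (line, h) <- withInputFile path $ \h -> do
    l <- hGetLine h
    return (l, h)
  closed <- hIsClosed h
  return (line, closed)
```
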

How did you intend text encodings to work?  I see several possibilities:

   textDecode :: TextEncoding -> [Octet] -> [Char]

or

   decodeInputStream :: TextEncoding -> InputStream -> TextInputStream
   getChar :: TextInputStream -> IO Char
   etc.

or

   setInputStreamCoding :: InputStream -> TextEncoding -> IO ()
   getChar :: InputStream -> IO Char

The first one is nice, but hard to optimise, and it will get complicated
for encodings which have state.  The second one is probably the best
compromise.
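
A rough sketch of the second option, under several assumptions of my
own: InputStream is modelled here as an in-memory list of octets, a
TextEncoding is just a function that decodes one character from the
front of that list, and the read operation is called textGetChar and
returns Nothing at end of stream.  None of these representations come
from the proposal, and fromOctets and latin1 are hypothetical helpers.
A stateful encoding would need the decoder to carry its state along
as well.

```haskell
import Data.Char (chr)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)

type Octet = Word8

-- Stand-in for the proposal's InputStream: an in-memory octet queue.
newtype InputStream = InputStream (IORef [Octet])

-- Assumption: an encoding decodes one character from the front of the
-- octet sequence and returns the unconsumed remainder.
newtype TextEncoding = TextEncoding ([Octet] -> Maybe (Char, [Octet]))

data TextInputStream = TextInputStream TextEncoding InputStream

-- ISO 8859-1: every octet maps to the code point of the same value.
latin1 :: TextEncoding
latin1 = TextEncoding step
  where
    step [] = Nothing
    step (b : bs) = Just (chr (fromIntegral b), bs)

fromOctets :: [Octet] -> IO InputStream
fromOctets os = fmap InputStream (newIORef os)

-- Layer a decoder over an octet stream; no octets are consumed yet.
decodeInputStream :: TextEncoding -> InputStream -> TextInputStream
decodeInputStream = TextInputStream

-- Decode the next character, or return Nothing at end of stream.
textGetChar :: TextInputStream -> IO (Maybe Char)
textGetChar (TextInputStream (TextEncoding step) (InputStream ref)) = do
  os <- readIORef ref
  case step os of
    Nothing -> return Nothing
    Just (c, rest) -> do
      writeIORef ref rest
      return (Just c)
```

The point of the wrapper type is that the octet-level and
character-level operations cannot be mixed up by accident, which the
third option would allow.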

> > data Directory	-- abstract

I don't see a reason for changing the existing Directory support
(System.Directory).  Could you give some motivation here?  Is the idea
to abstract away from the syntax of pathnames on the platform (eg.
directory separator characters)?  If so, I'm not sure it's worthwhile.
There are lots of differences between pathname conventions: case
sensitivity, arbitrary limits on the length of filenames, filename
extensions, and so on.

> Convenient shortcuts for common cases.
>
> > lookupFileByPathname :: String -> IO File

Here, I suggest we need

  lookupFileByPathname :: FilePath -> IOMode -> IO File

> > lookupInputStreamByPathname :: String -> IO InputStream
> >	-- at least as likely to succeed as lookupFileByPathname

and similarly

  createFileOutputStream :: FilePath -> IO OutputStream
  appendFile :: FilePath -> IO OutputStream
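
To make the suggested signatures concrete, here is a minimal sketch
that implements them as thin wrappers over the existing System.IO
calls.  It assumes, purely for illustration, that File and
OutputStream each wrap a Handle; the real types would be abstract.
roundTrip is a hypothetical usage example.  Note that appendFile
clashes with the Prelude function of the same name, so the sketch has
to hide it.

```haskell
import Prelude hiding (appendFile)
import System.IO
  ( Handle, IOMode (..), hClose, hGetLine, hPutStrLn, openBinaryFile )

-- Stand-ins for the proposal's abstract types (assumption: each one
-- simply wraps a Handle).
newtype File = File Handle
newtype OutputStream = OutputStream Handle

-- The caller states up front how the file will be used, so the
-- underlying open can request the right access mode.
lookupFileByPathname :: FilePath -> IOMode -> IO File
lookupFileByPathname path mode = fmap File (openBinaryFile path mode)

createFileOutputStream :: FilePath -> IO OutputStream
createFileOutputStream path = fmap OutputStream (openBinaryFile path WriteMode)

appendFile :: FilePath -> IO OutputStream
appendFile path = fmap OutputStream (openBinaryFile path AppendMode)

-- Usage: create a file, append to it, then read both lines back.
roundTrip :: FilePath -> IO (String, String)
roundTrip path = do
  OutputStream o <- createFileOutputStream path
  hPutStrLn o "one"
  hClose o
  OutputStream a <- appendFile path
  hPutStrLn a "two"
  hClose a
  File f <- lookupFileByPathname path ReadMode
  l1 <- hGetLine f
  l2 <- hGetLine f
  hClose f
  return (l1, l2)
```
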

Cheers,
	Simon