Raw I/O library proposal, second (more pragmatic) draft

Sun, 3 Aug 2003 22:51:39 -0700

In article
<3429668D0E777A499EE74A7952C382D19D1238@EUR-MSG-01.europe.corp.microsoft.com>,
 "Simon Marlow" <simonmar@microsoft.com> wrote:

> I wanted to float a generalisation of this scheme, though.  I'm
> wondering whether it might be a good idea to make InputStream and
> OutputStream into type classes, the advantage being that this makes
> streams more extensible - one example is that memory-mapped files fit
> neatly into this framework.  I already have 6 examples of things that
> can have streams layered on top (or *are* streams), and there are almost
> certainly more.
> 
> Here's some signatures for you to peruse:
> 
> class Stream s where
>       closeStream	   :: s -> IO ()
>       streamSetBuffering :: s -> BufferMode -> IO ()
>       streamGetBuffering :: s -> IO BufferMode
>       streamFlush	   :: s -> IO ()
>       isEOS		   :: s -> IO Bool
> 
> class InputStream s where
>       streamGet         :: s -> IO Word8
>       streamReadBuffer  :: s -> Integer -> Buffer -> IO ()
>       streamGetBuffer   :: s -> Integer -> IO ImmutableBuffer
>       streamGetContents :: s -> IO [Word8]
> 
> class OutputStream s where
>       streamPut         :: s -> Word8 -> IO ()
>       streamPuts        :: s -> [Word8] -> IO ()
>       streamWriteBuffer :: s -> Integer -> Buffer -> IO ()

I have some issues/queries with this.

1. The monad (and maybe also the "byte" type) should be parameterised. 
There are two ways of doing this.

General way:

  class InputStream s m | s -> m where
    streamGet :: s -> m Word8
    ...

Haskell 98-compatible way:

  class InputStream s where
    streamGet :: s m -> m Word8
    ...

Of course you'll need to be able to parameterise your buffer types, too.
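
To make the Haskell 98-compatible form concrete, here is a minimal sketch 
using only "base". ActionStream, popWith, listStreamST, listStreamIO and 
getTwo are names I made up for illustration; none of them is part of the 
proposal. The same client code runs in both ST and IO:

```haskell
import Control.Monad.ST (ST, runST)
import Data.IORef (newIORef, readIORef, writeIORef)
import Data.STRef (newSTRef, readSTRef, writeSTRef)
import Data.Word (Word8)

-- Haskell 98-compatible form: the stream type takes its monad as a parameter.
class InputStream s where
  streamGet :: s m -> m Word8

-- Hypothetical stream type: it just wraps the action yielding the next byte.
newtype ActionStream m = ActionStream (m Word8)

instance InputStream ActionStream where
  streamGet (ActionStream next) = next

-- Hypothetical helper: pop the head of a list held in a mutable reference,
-- given that reference's read and write actions.
popWith :: Monad m => m [Word8] -> ([Word8] -> m ()) -> m Word8
popWith readRef writeRef = do
  remaining <- readRef
  case remaining of
    []       -> error "popWith: end of stream"
    (b : bs) -> writeRef bs >> return b

-- An in-memory stream living in ST.
listStreamST :: [Word8] -> ST st (ActionStream (ST st))
listStreamST bytes = do
  ref <- newSTRef bytes
  return (ActionStream (popWith (readSTRef ref) (writeSTRef ref)))

-- The same stream living in IO.
listStreamIO :: [Word8] -> IO (ActionStream IO)
listStreamIO bytes = do
  ref <- newIORef bytes
  return (ActionStream (popWith (readIORef ref) (writeIORef ref)))

-- Client code is written once, against the class, for any monad.
getTwo :: (InputStream s, Monad m) => s m -> m (Word8, Word8)
getTwo s = do
  a <- streamGet s
  b <- streamGet s
  return (a, b)

-- Reading from a stream without touching IO at all.
pureDemo :: (Word8, Word8)
pureDemo = runST (listStreamST [1, 2, 3] >>= getTwo)
```

The "general way" would look much the same, but needs multi-parameter 
type classes and functional dependencies, which are not Haskell 98.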

2. streamWriteBuffer probably wants an ImmutableBuffer. The idea is that 
a "Buffer" is a special kind of "ImmutableBuffer" that also supports 
being set. Probably you should rename them. They're just flavours of 
"Array" anyway, aren't they?

3. I note that every class member has a type of the form "s -> a", for 
some "a" that does not depend on "s". This means streams might be a 
candidate for plain data structures:

  data InputStream m = InputStream {
     isStream :: Stream m,
     streamGet :: m Word8,
     streamReadBuffer :: Integer -> Buffer -> m (),
     ...
  }

I'm not sure which is preferable, however. Data-structure inheritance 
has to be done by hand (though see point 9: there might be nothing to 
inherit but a close function), and records don't allow default 
implementations (yet).
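
For what it's worth, here is a rough sketch of the record approach, 
cut down to a couple of fields. listInputStream and closeInput are 
hypothetical helpers of mine, and the choice of what goes into Stream 
is an assumption. It shows what "inheritance by hand" amounts to: the 
shared part is an ordinary field, and delegation is ordinary 
composition.

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- The part shared by all streams, as a plain record of actions.
data Stream m = Stream
  { closeStream :: m ()
  , isEOS       :: m Bool
  }

-- An input stream "inherits" from Stream by carrying one as a field.
data InputStream m = InputStream
  { isStream  :: Stream m
  , streamGet :: m Word8
  }

-- Hand-written delegation to the embedded Stream.
closeInput :: InputStream m -> m ()
closeInput = closeStream . isStream

-- Hypothetical constructor: an in-memory stream over a list of bytes.
listInputStream :: [Word8] -> IO (InputStream IO)
listInputStream bytes = do
  ref <- newIORef bytes
  let get = do
        remaining <- readIORef ref
        case remaining of
          []       -> ioError (userError "listInputStream: end of stream")
          (b : bs) -> writeIORef ref bs >> return b
  return InputStream
    { isStream = Stream
        { closeStream = writeIORef ref []          -- closing discards the rest
        , isEOS       = fmap null (readIORef ref)
        }
    , streamGet = get
    }
```

Since a stream is then just a value, a new kind of stream needs no new 
type and no instance declaration, only another function that builds 
the record.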

4. Shouldn't "streamFlush" belong to OutputStream?

5. It might be useful to have a "streamAvailable" function in 
InputStream that gets all bytes immediately available without blocking.

6. Shouldn't "isEOS" belong to InputStream? Come to think of it, perhaps 
streamGet should return a (Maybe Word8) instead of a Word8.
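
One thing the Maybe version buys you: functions such as 
streamGetContents can be derived from streamGet alone, with no separate 
isEOS test. A small sketch, in which listSource is a hypothetical 
in-memory source, and the strict (not lazy) reading is my own 
simplification:

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Read bytes until the source reports end of stream with Nothing.
-- This version is strict: it returns only once the stream is exhausted.
streamGetContents :: Monad m => m (Maybe Word8) -> m [Word8]
streamGetContents get = go []
  where
    go acc = do
      next <- get
      case next of
        Nothing -> return (reverse acc)
        Just b  -> go (b : acc)

-- Hypothetical in-memory source: yields each byte once, then Nothing forever.
listSource :: [Word8] -> IO (IO (Maybe Word8))
listSource bytes = do
  ref <- newIORef bytes
  return $ do
    remaining <- readIORef ref
    case remaining of
      []       -> return Nothing
      (b : bs) -> writeIORef ref bs >> return (Just b)
```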

7. OutputStream should have a streamSendEOS separate from streamClose. 
This is used in TCP, where one can say "I'm done sending" on the 
outgoing channel while still receiving octets on the incoming channel.

I would argue that for file streams, it would be appropriate for 
streamSendEOS to set the end of file at that point.

8. Buffering is a special thing, isn't it? Should input-buffering and 
output-buffering be separated? Should there be separate classes for 
buffering?

9. If you accept all this, all you have left in Stream is closeStream, 
which I would argue is all that "sources" and "sinks" really have in 
common. And perhaps some don't even need that. The class might be better 
named "Closable" or somesuch.
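
Putting points 4, 6, 7 and 9 together, the classes might end up looking 
something like the sketch below. This is only one possible reading of 
my own suggestions: buffering (point 8) is left out entirely, the monad 
is fixed to IO for brevity, and MemSink is a made-up in-memory instance 
just to show the intended behaviour.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Point 9: all that sources and sinks share.
class Closable s where
  close :: s -> IO ()

class Closable s => InputStream s where
  streamGet :: s -> IO (Maybe Word8)   -- point 6: Nothing at end of stream

class Closable s => OutputStream s where
  streamPut     :: s -> Word8 -> IO ()
  streamFlush   :: s -> IO ()          -- point 4: flushing is an output matter
  streamSendEOS :: s -> IO ()          -- point 7: "I'm done sending"

-- Hypothetical in-memory sink.
data MemSink = MemSink
  { pending   :: IORef [Word8]   -- bytes put but not yet flushed, newest first
  , delivered :: IORef [Word8]   -- bytes flushed so far, in order
  , eosSent   :: IORef Bool
  }

newMemSink :: IO MemSink
newMemSink = do
  p <- newIORef []
  d <- newIORef []
  e <- newIORef False
  return (MemSink p d e)

instance Closable MemSink where
  close sink = writeIORef (pending sink) []   -- unflushed bytes are dropped

instance OutputStream MemSink where
  streamPut sink b = do
    done <- readIORef (eosSent sink)
    if done
      then ioError (userError "streamPut: end of stream already sent")
      else modifyIORef (pending sink) (b :)
  streamFlush sink = do
    queued <- readIORef (pending sink)
    modifyIORef (delivered sink) (++ reverse queued)
    writeIORef (pending sink) []
  streamSendEOS sink = do
    streamFlush sink                          -- deliver what is queued first
    writeIORef (eosSent sink) True
```

Note that streamSendEOS leaves the sink open: close is still a separate 
step, which is the distinction TCP needs.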

> -- Files in the filesystem, with access rights
> data File
> data FileInputStream  -- instance Stream, InputStream
> data FileOutputStream -- instance Stream, OutputStream

Would it be worth also providing a random-access-based interface to 
files in addition to a stream-based? I know either one can be built on 
the other. For instance:

  accessFile :: File -> IO FileAccess
  closeAccess :: FileAccess -> IO ()
  getFileLength :: FileAccess -> IO Integer
  setFileLength :: FileAccess -> Integer -> IO ()
  readFile :: FileAccess -> Integer -> Integer -> IO ImmutableBuffer
  writeFile :: FileAccess -> Integer -> ImmutableBuffer -> IO ()

Of course, this might be better generalised as a class.
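
As a rough illustration, such an interface can already be approximated 
on top of the existing Handle operations in System.IO. The sketch below 
makes several assumptions of my own: a FilePath stands in for File, 
[Word8] stands in for ImmutableBuffer, and the read and write functions 
are renamed readFileAt and writeFileAt so they don't clash with the 
Prelude.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (allocaArray, peekArray, withArrayLen)
import System.IO

-- Hypothetical representation: a read/write binary Handle.
newtype FileAccess = FileAccess Handle

accessFile :: FilePath -> IO FileAccess
accessFile path = fmap FileAccess (openBinaryFile path ReadWriteMode)

closeAccess :: FileAccess -> IO ()
closeAccess (FileAccess h) = hClose h

getFileLength :: FileAccess -> IO Integer
getFileLength (FileAccess h) = hFileSize h

setFileLength :: FileAccess -> Integer -> IO ()
setFileLength (FileAccess h) = hSetFileSize h

-- Read up to 'count' bytes starting at 'offset'; fewer are returned
-- if the file ends first.
readFileAt :: FileAccess -> Integer -> Integer -> IO [Word8]
readFileAt (FileAccess h) offset count = do
  hSeek h AbsoluteSeek offset
  allocaArray n $ \ptr -> do
    got <- hGetBuf h ptr n
    peekArray got ptr
  where
    n = fromInteger count

-- Write the given bytes starting at 'offset'.
writeFileAt :: FileAccess -> Integer -> [Word8] -> IO ()
writeFileAt (FileAccess h) offset bytes = do
  hSeek h AbsoluteSeek offset
  withArrayLen bytes $ \n ptr -> hPutBuf h ptr n
```

Every operation takes an explicit offset, so there is no notion of a 
"current position" visible to the caller, which is the main difference 
from the stream interface.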

> -- URIs:
> data URIStream
> getURI :: URI -> IO URIStream
> instance InputStream URIStream

Eeesh. URIs hide a lot of semantics, which this rather glosses over. And 
isn't writing an FTP client hard?

-- 
Ashley Yakeley, Seattle WA