new i/o library

Fri Feb 3 06:28:12 EST 2006

On 03 February 2006 08:34, Bulat Ziganshin wrote:

>>> moreover - we can implement locking as a special "converter" type
>>> that can be applied to any mutable object - stream, collection,
>>> counter. that lets us simplify implementations and add locking
>>> only to those Streams where we really need it. like this:
>>> 
>>> h <- openFD "test"
>>>        >>= addUsingOfSelect
>>>        >>= addBuffering 65536
>>>        >>= addCharEncoding utf8
>>>        >>= attachUserData dictionary
>>>        >>= addLocking
> 
>> This is really nice - exactly what I'd like to see in the I/O
>> library. The trick is making it perform well, though...  but I'm
>> sure that's your main focus too.
> 
> the basic idea is very simple - every stream implements the Stream
> interface, which is a clone of the Handle interface. Stream transformers
> are just types that take a Stream parameter and in turn implement the
> same interface. all Stream operations are translated to calls on the
> inner Stream. typical example:
> 
> data WithLocking h = WithLocking h (MVar ())

There's a choice here; I did it with existentials:

   data ThreadSafeStream = forall h . Stream h => TSStream h !(MVar ())
   instance Stream ThreadSafeStream where ...

What are the tradeoffs?  Well, existentials aren't standard for one
thing, but what about performance?  Every stream operation on the outer
stream translates to a dynamic call through the dictionary stored in the
stream.  Lots of layers means lots of dynamic calls, which probably
won't be efficient.
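
To make the existential option concrete, here is a minimal sketch.  The
one-method Stream class, the in-memory MemStream and the addLocking helper
are assumptions made up for illustration, not the real library interface.
The point is only that the Stream dictionary for the inner stream is
captured once, when TSStream is constructed, and every later call goes
through it.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module Main where

import Control.Concurrent.MVar (MVar, newMVar, withMVar)
import Data.IORef (IORef, newIORef, readIORef, modifyIORef')

-- Hypothetical, cut-down Stream class with a single method.
class Stream h where
  sPutStr :: h -> String -> IO ()

-- Hypothetical in-memory stream so the example needs no files.
newtype MemStream = MemStream (IORef String)

instance Stream MemStream where
  sPutStr (MemStream ref) s = modifyIORef' ref (++ s)

-- Existential wrapper: the dictionary for h is stored in the constructor.
data ThreadSafeStream = forall h . Stream h => TSStream h !(MVar ())

instance Stream ThreadSafeStream where
  -- Take the lock, then make a dynamic call through the stored dictionary.
  sPutStr (TSStream h lock) s = withMVar lock (\_ -> sPutStr h s)

-- Hypothetical helper that wraps any stream with a fresh lock.
addLocking :: Stream h => h -> IO ThreadSafeStream
addLocking h = TSStream h <$> newMVar ()

main :: IO ()
main = do
  ref <- newIORef ""
  ts  <- addLocking (MemStream ref)
  sPutStr ts "hello, "
  sPutStr ts "world"
  readIORef ref >>= putStrLn   -- prints "hello, world"
```

Note that the type of ts says nothing about what is underneath the lock,
which is what rules out per-combination specialisation later on.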

What about compared to your version:

> instance (Stream IO h) => Stream IO (WithLocking h) where

so a stream might have a type like this:

  WithLocking (WithCharEncoding (WithBuffer FileStream))

and calling any overloaded stream operation will have to build a *new*
dictionary as deep as the layering.  GHC might be able to share these
dictionaries across multiple calls within a function, but I bet you'll
end up building dictionaries a lot.  Compare this with the existential
version, which builds the dictionary once per stream.
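
For comparison, a minimal sketch of the type-parameterised layering.
Again the one-method class, MemStream, the pass-through WithBuffer layer
and the greet function are assumptions for illustration only.  Here the
layering is visible in the type, and an overloaded function such as greet
takes a Stream dictionary that has to be assembled from the instances for
each layer.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, FlexibleContexts #-}
module Main where

import Control.Concurrent.MVar (MVar, newMVar, withMVar)
import Data.IORef (IORef, newIORef, readIORef, modifyIORef')

-- Hypothetical, cut-down Stream class, abstracted over the monad.
class Monad m => Stream m h where
  sPutStr :: h -> String -> m ()

-- Hypothetical in-memory base stream.
newtype MemStream = MemStream (IORef String)

instance Stream IO MemStream where
  sPutStr (MemStream ref) s = modifyIORef' ref (++ s)

-- Hypothetical stand-in for a buffering layer: it only forwards.
newtype WithBuffer h = WithBuffer h

instance Stream IO h => Stream IO (WithBuffer h) where
  sPutStr (WithBuffer h) = sPutStr h

data WithLocking h = WithLocking h (MVar ())

instance Stream IO h => Stream IO (WithLocking h) where
  sPutStr (WithLocking h lock) s = withMVar lock (\_ -> sPutStr h s)

addLocking :: h -> IO (WithLocking h)
addLocking h = WithLocking h <$> newMVar ()

-- Overloaded function: its Stream dictionary is passed in by the caller.
greet :: Stream IO h => h -> IO ()
greet h = sPutStr h "hello"

main :: IO ()
main = do
  ref <- newIORef ""
  -- h :: WithLocking (WithBuffer MemStream)
  h <- addLocking (WithBuffer (MemStream ref))
  greet h
  readIORef ref >>= putStrLn   -- prints "hello"
```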

On the other hand, you can also use {-# SPECIALISE #-} with your version
to optimise common combinations of layers.  I don't know if there's a
way to get specialisation with the existential version, but it seems
like a high priority, at least for the layers up to the buffering layer.
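
A sketch of what such a pragma could look like.  The class, the
WithCount layer and the overloaded sPutStr helper are hypothetical; only
the SPECIALISE pragma itself is standard GHC.  It asks GHC to compile a
copy of sPutStr at one concrete layer combination, so no dictionary needs
to be passed for that combination.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, FlexibleContexts #-}
module Main where

import Data.IORef (IORef, newIORef, readIORef, modifyIORef')

-- Hypothetical, cut-down Stream class.
class Monad m => Stream m h where
  sPutChar :: h -> Char -> m ()

-- Hypothetical in-memory base stream (characters are stored reversed).
newtype MemStream = MemStream (IORef String)

instance Stream IO MemStream where
  sPutChar (MemStream ref) c = modifyIORef' ref (c :)

-- Hypothetical layer that counts characters written through it.
data WithCount h = WithCount h (IORef Int)

instance Stream IO h => Stream IO (WithCount h) where
  sPutChar (WithCount h n) c = modifyIORef' n (+ 1) >> sPutChar h c

-- Overloaded helper, with a specialised copy requested for one
-- common combination of layers.
sPutStr :: Stream IO h => h -> String -> IO ()
sPutStr h = mapM_ (sPutChar h)
{-# SPECIALISE sPutStr :: WithCount MemStream -> String -> IO () #-}

main :: IO ()
main = do
  ref <- newIORef ""
  cnt <- newIORef (0 :: Int)
  sPutStr (WithCount (MemStream ref) cnt) "abc"
  n <- readIORef cnt
  s <- readIORef ref
  print (n, reverse s)   -- prints (3,"abc")
```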

Also, your version is abstracted over the monad, which is another layer
to optimise away (good luck :-).

>> Still, I'm not sure that putting both input and output streams in the
>> same type is the right thing, I seem to remember a lot of things
>> being simpler with them separate.
> 
> i'm interested to hear which things were simpler. in my design it seems
> the other way round - i will need to write far more code to separate
> read, write and read-write FDs, for example. maybe the difference is
> because i have one large Stream class that implements all the
> functions while your design had a lot of classes with a few functions
> in each

Not dealing with the read/write case makes things a lot easier.
Read/write files are very rare, I don't think there's any problem with
requiring the file to be opened twice in this case.  Read/write sockets
are very common, of course, but they are exactly the same as separate
read & write streams because they don't share any state (no file
pointer).

Having separate input/output streams means you have to do less checking,
so performance will be better, and there are fewer error cases.  Each
class has fewer methods, again better for performance.  The types are
more informative, and hence more useful.  Also, you can do cool stuff
like:

-- | Takes an output stream, and returns an input stream that will yield
-- all the data that is written to the output stream.
streamOutputToInput :: (OutputStream s) => s -> IO StreamInputStream

-- | Takes an output stream and an input stream, and pipes all the
-- data from the former into the latter.
streamConnect :: (OutputStream o, InputStream i) => o -> i -> IO ()

Sure you can do these with one Stream class, but the types aren't nearly
as nice.
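
To illustrate how the separate classes show up in the types, here is a
small sketch.  The one-method InputStream and OutputStream classes, the
in-memory MemIn and MemOut streams, and the copyAll helper are all
hypothetical and much simpler than the functions above; copyAll is not
streamConnect, just a plain copy loop whose signature states which
argument is read and which is written.

```haskell
module Main where

import Data.IORef (IORef, newIORef, readIORef, writeIORef, modifyIORef')

-- Hypothetical minimal classes: one method each.
class InputStream s where
  sGetChar :: s -> IO (Maybe Char)   -- Nothing at end of stream

class OutputStream s where
  sPutChar :: s -> Char -> IO ()

-- Hypothetical in-memory streams.
newtype MemIn  = MemIn  (IORef String)
newtype MemOut = MemOut (IORef String)

instance InputStream MemIn where
  sGetChar (MemIn ref) = do
    s <- readIORef ref
    case s of
      []     -> return Nothing
      (c:cs) -> writeIORef ref cs >> return (Just c)

instance OutputStream MemOut where
  sPutChar (MemOut ref) c = modifyIORef' ref (c :)   -- stored reversed

-- Read from i until end of stream, writing each character to o.
-- Passing the arguments the wrong way round is a type error.
copyAll :: (InputStream i, OutputStream o) => i -> o -> IO ()
copyAll i o = loop
  where
    loop = sGetChar i >>= maybe (return ()) (\c -> sPutChar o c >> loop)

main :: IO ()
main = do
  inRef  <- newIORef "stream"
  outRef <- newIORef ""
  copyAll (MemIn inRef) (MemOut outRef)
  readIORef outRef >>= putStrLn . reverse   -- prints "stream"
```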

Oh, one more thing: if you have a way to turn a ForeignPtr into a
Stream, then this can be used both for mmap'd files and for turning
(say) a PackedString into a Stream.
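
A rough sketch of the ForeignPtr idea, assuming a hypothetical MemInput
type that reads bytes from a fixed-size memory region.  The names
MemInput, fromForeignPtr and getByte are made up for illustration; the
Foreign.* calls are from the standard base library.  Whether the memory
came from mmap or from a packed string makes no difference to the reader.

```haskell
module Main where

import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- Hypothetical read-only byte stream: buffer, length, current offset.
data MemInput = MemInput (ForeignPtr Word8) !Int (IORef Int)

fromForeignPtr :: ForeignPtr Word8 -> Int -> IO MemInput
fromForeignPtr fp len = MemInput fp len <$> newIORef 0

-- Return the next byte, or Nothing once the region is exhausted.
getByte :: MemInput -> IO (Maybe Word8)
getByte (MemInput fp len posRef) = do
  pos <- readIORef posRef
  if pos >= len
    then return Nothing
    else do
      b <- withForeignPtr fp (\p -> peekByteOff p pos)
      writeIORef posRef (pos + 1)
      return (Just b)

main :: IO ()
main = do
  -- Stand-in for an mmap'd region: three bytes allocated and filled here.
  fp <- mallocForeignPtrBytes 3
  withForeignPtr fp $ \p ->
    mapM_ (\(i, b) -> pokeByteOff p i (b :: Word8)) (zip [0 ..] [104, 105, 33])
  s <- fromForeignPtr fp 3
  bytes <- sequence [getByte s, getByte s, getByte s, getByte s]
  print bytes   -- prints [Just 104,Just 105,Just 33,Nothing]
```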

Cheers,
	Simon