[Haskell] Streams: the extensible I/O library

Bulat Ziganshin bulatz at HotPOP.com
Fri Feb 10 08:07:08 EST 2006


Hello Andrew,

Wednesday, February 08, 2006, 8:24:59 PM, you wrote:

>> AP> Bulat, it wouldn't hurt to include a motivation section at the top.  As
>> AP> I understand, it's ultimately all about speed, right?  Otherwise, we
>> AP> would all be happy with lists (and unsafeInterleave*).  So maybe a
>> AP> comparison between Stream and [] should be given.
>> 
>> you guessed wrong :)  this library is ultimately about replacing
>> System.IO library (i.e. Handles)

AP> Let me rephrase my question:  Why not just reimplement the Handles API
AP> (with some extensions like binary IO)?  Is there really a need to use a
AP> handle-like API for more than real IO?  If so, what is the need,
AP> expressivity or performance (or both)?  Maybe a use case showing what
AP> you can do with your library, and how you would have to do it otherwise?

now i understand you. actually, my presentation was meant only for the
few people who already know about the limitations of the current
library, who have already requested additional features but didn't get
them. here i need to give some history:

when the System.IO interface was developed, it implemented far fewer
features than now, and its implementation was simple and
straightforward. as time went on, more and more features were added to
this library: a complex buffering scheme, several async i/o
implementations, locking, networking. And at the moment, GHC's
System.IO implementation consists of about 3000 lines with a rather
monolithic structure. you can't easily add a new feature or make the
use of some "heavyweight" feature optional, because that requires
changes throughout the entire library. As a result, GHC users can't
get the improvements to this library that they need for their work.
Some of them even develop their own libraries that implement just what
they need. for example, Einar Karttunen developed a networking library
with advanced async i/o and support for i/o of fast packed strings.
But such solutions are not universal - his library can't be used for
file i/o, for example, although the code is essentially the same.

what have i done? the main merit of the Streams library is not the
implementation of any particular feature, but the birth of a framework
for I/O sub-libraries. my library is essentially just a collection of
such sublibs. the first, for example, implements file i/o, the second
implements buffering, the third - utf-8 encoding, and so on. the most
important property of all these sublibs is that none of them is larger
than 300 lines. that means it is far easier to understand, modify or
even replace any of them. and that will have no impact on other parts
of the library, because all these sublibs are bound together not via
data fields, but via well-defined interfaces.

now, implementing any new I/O feature or new I/O source means only
implementing an interface conforming to the Stream class - all other
features, including locking, buffering, encoding, compression,
serialization, binary and text i/o, async i/o, will become available
automatically. the same goes for transformers - once implemented, gzip
compression or UTF-16 encoding support will automatically become
available for all the I/O sources, present and future. isn't that
great? :)  moreover, a user application or third-party lib can easily
implement new stream types or transformers without touching the
original library.
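
to make the idea concrete, here is a minimal sketch of how a class-based
interface lets sources and transformers compose. it is NOT the actual
Stream class from the library: the names CharStream, MemStream, Upper,
sPutChar, sGetChar, sPutStr, sGetContents and demo are all hypothetical,
and upper-casing merely stands in for a real transformer such as an
encoder or compressor.

```haskell
import Data.Char (toUpper)
import Data.IORef (IORef, newIORef, readIORef, writeIORef, modifyIORef)

-- Hypothetical, deliberately tiny interface (an assumption for this
-- sketch; the real Stream class has many more operations).
class CharStream h where
  sPutChar :: h -> Char -> IO ()
  sGetChar :: h -> IO (Maybe Char)   -- Nothing at end of stream

-- An I/O "source": an in-memory FIFO of characters.
newtype MemStream = MemStream (IORef String)

newMemStream :: IO MemStream
newMemStream = fmap MemStream (newIORef "")

instance CharStream MemStream where
  -- Appending at the end is O(n); good enough for a sketch.
  sPutChar (MemStream ref) c = modifyIORef ref (++ [c])
  sGetChar (MemStream ref) = do
    s <- readIORef ref
    case s of
      []     -> return Nothing
      (c:cs) -> writeIORef ref cs >> return (Just c)

-- A "transformer": wraps ANY stream and upper-cases what is written.
newtype Upper h = Upper h

instance CharStream h => CharStream (Upper h) where
  sPutChar (Upper h) = sPutChar h . toUpper
  sGetChar (Upper h) = sGetChar h

-- Generic operations, written once against the interface only.
sPutStr :: CharStream h => h -> String -> IO ()
sPutStr h = mapM_ (sPutChar h)

sGetContents :: CharStream h => h -> IO String
sGetContents h = do
  mc <- sGetChar h
  case mc of
    Nothing -> return []
    Just c  -> fmap (c :) (sGetContents h)

-- Write through the transformer, read back from the underlying source.
demo :: IO String
demo = do
  mem <- newMemStream
  sPutStr (Upper mem) "hello"
  sGetContents mem
```

in this sketch sPutStr and sGetContents know nothing about MemStream or
Upper, so a new source or transformer written later would get them for
free, which is the property described above.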

so, in some respects this lib is a meta-meta-instrument whose value
will grow automatically as new sublibs appear. right at this moment
its advantages over System.IO are in the following areas:

faster i/o
support for optional utf-8 encoding
binary i/o and serialization
user-controlled locking

if you look inside the archive, you will find the directory Examples,
which demonstrates the use of all these features. as i said in the
docs, i also plan to implement other user requests for the System.IO
library.

another consequence of this library's emergence is that all these
features will become available on Hugs and other Haskell compilers
that never had enough manpower to develop a library as extensive as
GHC's System.IO.

and about using streams in monads other than IO: i really don't know
whether it will be used or not. at least for serialization it looks
promising. for example, there are functions "encode" and "decode" that
are like the show/read pair, but implement binary encoding according
to the instances of the Binary class. and of course, they are
implemented through the StringBuffer instance of the Stream class,
working in the ST monad.
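
a rough sketch of that idea, under loud assumptions: the class names
ByteStream and Bin, the buffer type Buf, the byte format, and the
signatures of encode/decode below are all invented for illustration and
are not the library's StringBuffer, Binary class or real API. the only
point is that a stream interface parameterised over the monad lets a
pure encode/decode pair be built with runST.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef, modifySTRef)
import Data.Word (Word8)

-- Hypothetical byte-stream interface over an arbitrary monad m.
class Monad m => ByteStream m h where
  putByte :: h -> Word8 -> m ()
  getByte :: h -> m (Maybe Word8)   -- Nothing when no bytes are left

-- An in-memory buffer living in the ST monad (assumed stand-in for
-- something like StringBuffer).
newtype Buf s = Buf (STRef s [Word8])

instance ByteStream (ST s) (Buf s) where
  putByte (Buf ref) b = modifySTRef ref (++ [b])
  getByte (Buf ref) = do
    bs <- readSTRef ref
    case bs of
      []       -> return Nothing
      (b:rest) -> writeSTRef ref rest >> return (Just b)

-- Hypothetical serialization class: works with any ByteStream.
class Bin a where
  put :: ByteStream m h => h -> a -> m ()
  get :: ByteStream m h => h -> m (Maybe a)

instance Bin Bool where
  put h b = putByte h (if b then 1 else 0)
  get h   = fmap (fmap (/= 0)) (getByte h)

-- Lists: tag byte 1 before each element, tag byte 0 at the end.
instance Bin a => Bin [a] where
  put h []     = putByte h 0
  put h (x:xs) = putByte h 1 >> put h x >> put h xs
  get h = do
    tag <- getByte h
    case tag of
      Just 0 -> return (Just [])
      Just 1 -> do
        mx  <- get h
        mxs <- get h
        return ((:) <$> mx <*> mxs)
      _      -> return Nothing

-- Pure wrappers: the stream exists only inside runST.
encode :: Bin a => a -> [Word8]
encode x = runST (do
  ref <- newSTRef []
  put (Buf ref) x
  readSTRef ref)

decode :: Bin a => [Word8] -> Maybe a
decode bytes = runST (do
  ref <- newSTRef bytes
  get (Buf ref))
```

with these definitions, decode (encode [True, False]) gives back
Just [True, False], and the same Bin instances would work unchanged
over an IO-based instance of the same class.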

compared to Handles, the library provides essentially the same
interface. again, you can find information about switching from
Handles to Streams in the docs. in the future i plan to provide a
"legacy layer" which will emulate 100% of the System.IO interface but
use streams internally. It will be essential for old apps, especially
if Streams becomes the official and sole GHC i/o library.

about internal organization - Streams is somewhat as if Simon himself
had split the Handles library into small independent parts and then
replaced some of them with simpler/faster implementations. nothing
more, except for the common Stream class interface, developed by
John Goerzen. my work was mainly to bring the best ideas together :)

-- 
Best regards,
 Bulat                            mailto:bulatz at HotPOP.com




