I Hate IO

Ashley Yakeley ashley@semantic.org
Mon, 13 Aug 2001 18:46:49 -0700


At 2001-08-13 08:29, Marcin 'Qrczak' Kowalczyk wrote:

>Fri, 10 Aug 2001 18:31:24 -0700, Ashley Yakeley <ashley@semantic.org> wrote:
>
>> I can only tell you what's useful for me as an API user. Since there are 
>> peculiar little differences between different kinds of sockets, my ideal 
>> would be every type of socket to have its own API functions, and then 
>> common patterns brought together with classes.
>
>It makes it harder to program polymorphically. In particular, having
>a heterogeneous datastructure requires wrapping different types
>in either function closures or existential types, thus awkwardly
>emulating the current interface.

Do you have an example? Can't you simply substitute something like 
"(Connection c) => c" for "Handle"?
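
As a rough sketch of that substitution: the Connection class, its three 
methods, and the in-memory Loopback type below are all hypothetical, made 
up for illustration and not taken from any existing library.

```haskell
import Data.IORef (IORef, atomicModifyIORef', modifyIORef', newIORef)
import Data.Word (Word8)

-- Hypothetical class collecting the operations common to all connections.
class Connection c where
    send    :: c -> [Word8] -> IO ()
    receive :: c -> IO [Word8]  -- returns and clears whatever is buffered
    close   :: c -> IO ()

-- Toy connection that just buffers what is sent to it (an assumption,
-- standing in for a real socket type).
newtype Loopback = MkLoopback (IORef [Word8])

newLoopback :: IO Loopback
newLoopback = fmap MkLoopback (newIORef [])

instance Connection Loopback where
    send (MkLoopback ref) bytes = modifyIORef' ref (++ bytes)
    receive (MkLoopback ref) = atomicModifyIORef' ref (\bytes -> ([], bytes))
    close _ = return ()

-- Written against the class rather than against Handle, so it works for
-- any type that has a Connection instance.
sendGreeting :: (Connection c) => c -> IO ()
sendGreeting conn = send conn [72, 105]
```

Code like sendGreeting never needs to know which kind of connection it was 
given; only code that wants a kind-specific operation has to name the type.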

>Items which are used interchangeably shouldn't have different types
>in a language like Haskell which doesn't provide subtyping. 

Isn't that what classes are for? I've noticed that the more I use 
Haskell, the less I miss subtyping... classes and "data" unions are 
almost always sufficient. But perhaps you have an example which would 
really want subtyping, given an API for which every type of connection 
handle had its own type.

...
>> But at the very least, separate out files from sockets. I'm really
>> not sure why files have a "file pointer" stream-based API at all.
>
>Because it's convenient. Files are usually read sequentially.

Files are mostly read in their entirety, but sometimes they are read 
sequentially (e.g. by parsers), and very occasionally they are read 
"randomly" (e.g. pointer-based formats such as TIFF, or databases and 
the like).

My solution would be to have the API "random-access" based, and then have 
a type that provided the current stream-based interface (which I would 
use for your example):

--
data FileStream = MkFileStream
     {
     fsFile :: OpenFile,
     fsPointer :: Integer
     }

instance Closable FileStream where
     ...
instance Source FileStream where
     ...
instance Sink FileStream where
     ...
--
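
One way the elided instances might be filled in is sketched here. It rests 
on several assumptions: the RandomAccess class, the readBytes method, and 
the in-memory MemFile type are hypothetical, and the pointer is kept in an 
IORef so that it can advance between reads.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.List (genericDrop, genericLength, genericTake)
import Data.Word (Word8)

-- Hypothetical random-access interface: read up to len bytes at an offset.
class RandomAccess f where
    readBlock :: f -> Integer -> Integer -> IO [Word8]

-- Hypothetical stream interface: read the next len bytes.
class Source s where
    readBytes :: s -> Integer -> IO [Word8]

-- Toy stand-in for an open file, with its contents held in memory.
newtype MemFile = MkMemFile [Word8]

instance RandomAccess MemFile where
    readBlock (MkMemFile bytes) start len =
        return (genericTake len (genericDrop start bytes))

-- The stream is just the file plus a mutable current position.
data FileStream f = MkFileStream
    { fsFile    :: f
    , fsPointer :: IORef Integer
    }

openStream :: f -> IO (FileStream f)
openStream file = fmap (MkFileStream file) (newIORef 0)

instance (RandomAccess f) => Source (FileStream f) where
    readBytes stream len = do
        pos   <- readIORef (fsPointer stream)
        bytes <- readBlock (fsFile stream) pos len
        -- Advance by what was actually read, which may be short at the end.
        writeIORef (fsPointer stream) (pos + genericLength bytes)
        return bytes
```

The file itself carries no position, so several independent streams could 
be opened over the same file without interfering with each other.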

With the current API, one can provide a "random-access" based interface 
something like this:

--
data OpenFile = MkOpenFile
     {
     ofHandle :: Handle,
     ofLock :: ThreadLock -- or whatever
     }

readFileBlock :: OpenFile -> Integer -> Integer -> IO [Word8]
readFileBlock file start len = synchronized (ofLock file) $ do
     {
     let {handle = ofHandle file};
     hSeek handle AbsoluteSeek start;
     ... repeatedly call hGetChar handle ...
     }

... etc.
--

...provided that there was a way of shortening a file and that the API 
used Word8s instead of Chars. This just seems a whole lot uglier.

I don't know much about Haskell's multithreading support, but if one 
wanted multiple threads using the same file, one would need some kind of 
thread-locking (as suggested above). By contrast, with a "random-access" 
API it might even be possible to allow multiple simultaneous read 
operations.
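
The locking could be sketched with an MVar guarding the Handle, so that 
the seek and the read happen as one step. This is only a sketch: the names 
openFileRO and closeFile are made up, Data.ByteString is assumed to be 
available for the byte-level read, and the length is an Int only because 
that is what B.hGet takes.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, withMVar)
import qualified Data.ByteString as B
import Data.Word (Word8)
import System.IO
    (Handle, IOMode (ReadMode), SeekMode (AbsoluteSeek), hClose, hSeek,
     openBinaryFile)

-- The Handle lives inside the MVar, so it can only be reached while
-- holding the lock.
newtype OpenFile = MkOpenFile (MVar Handle)

-- Hypothetical helper: open a file read-only in binary mode.
openFileRO :: FilePath -> IO OpenFile
openFileRO path = do
    handle <- openBinaryFile path ReadMode
    fmap MkOpenFile (newMVar handle)

-- Seek and read under the lock, so that another thread cannot move the
-- file pointer in between.
readFileBlock :: OpenFile -> Integer -> Int -> IO [Word8]
readFileBlock (MkOpenFile lock) start len =
    withMVar lock $ \handle -> do
        hSeek handle AbsoluteSeek start
        fmap B.unpack (B.hGet handle len)

-- Hypothetical helper: close the underlying Handle.
closeFile :: OpenFile -> IO ()
closeFile (MkOpenFile lock) = withMVar lock hClose
```

Every read is serialized through the one lock here, which is exactly the 
cost that a natively random-access API might avoid.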

But I admit my preference is essentially aesthetic. Files are by nature 
accessible at any point, but we choose to pretend that they are streams 
(because that's convenient most of the time) and then we have to jump 
through hoops to make the streaminess go away when we want actual direct 
access to arbitrary offsets.

My preference for libraries is for interfaces that reveal the true nature 
of the object as accurately as possible, and to make explicit all the 
little patterns and conversions that programmers like to use. I believe 
Haskell is powerful enough to do this cleanly even without subtyping... 
the advantage is essentially the same advantage as strong typing: more 
errors are caught at compile-time. You can no longer write to standard 
input, or seek on a TCP connection, or get the remote IP address of an 
AppleTalk connection, etc.
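
To illustrate the compile-time point, here is a small sketch in which 
standard input and standard output get their own types. The Source and 
Sink method names, the StdInput and StdOutput types, and the use of 
Data.ByteString for the byte-level calls are all assumptions made for the 
example.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)
import System.IO (Handle, stdin, stdout)

-- Hypothetical classes: things that can be read from, and written to.
class Source s where
    readBytes :: s -> Int -> IO [Word8]

class Sink s where
    writeBytes :: s -> [Word8] -> IO ()

-- Each kind of handle gets its own type.
newtype StdInput  = MkStdInput Handle
newtype StdOutput = MkStdOutput Handle

standardInput :: StdInput
standardInput = MkStdInput stdin

standardOutput :: StdOutput
standardOutput = MkStdOutput stdout

-- Standard input is only a Source; standard output is only a Sink.
instance Source StdInput where
    readBytes (MkStdInput h) n = fmap B.unpack (B.hGet h n)

instance Sink StdOutput where
    writeBytes (MkStdOutput h) = B.hPut h . B.pack

-- The following would be rejected by the type checker, because there is
-- no Sink instance for StdInput:
--
--     writeBytes standardInput [104, 105]
```

The mistake that a bare Handle only reports at run time becomes a missing 
instance at compile time.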


-- 
Ashley Yakeley, Seattle WA