Word8-Based IO

Ketil Z. Malde ketil@ii.uib.no
21 Aug 2002 09:52:24 +0200


Ashley Yakeley <ashley@semantic.org> writes:

> Was there any kind of consensus emerging on this?

I think we agreed, more or less, on low-level Word8-based IO, with
Char-based functions on top of this doing the encoding/decoding.
(Possibly implementing the Char functions natively for speed, but I
suggest delaying that until it becomes a problem.)

> Leaving aside what might be done with the Char-based functions (and
> people seem to think they're fine as is) 

Well, clearly, they need to be able to incorporate the encodings
people are going to use.  Just stuffing in Word8s works fine for now,
but we should probably reimplement them in terms of Word8 IO, and
after that, wrap at least a UTF-8 codec.  (I occasionally need
something to translate UTF-8 mail to ISO-8859-1; this sounds like a
simple way to get a feel for Unicode IO, and is probably a good
alternative to upgrading Gnus :-)
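
As an illustration of the UTF-8 to ISO-8859-1 translation mentioned
above, here is a minimal sketch working purely on octet lists.  The
names decodeUtf8, toLatin1 and utf8ToLatin1 are hypothetical, not part
of any proposal in this thread.  The decoder is deliberately lenient:
it does not reject overlong forms or surrogates, and anything that
does not fit in ISO-8859-1 is replaced by '?'.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Word (Word8)

-- Decode UTF-8 octets into code points (hypothetical helper).
-- Malformed or truncated sequences yield U+FFFD.
decodeUtf8 :: [Word8] -> [Int]
decodeUtf8 [] = []
decodeUtf8 (b:bs)
  | b < 0x80              = fromIntegral b : decodeUtf8 bs
  | b >= 0xC0 && b < 0xE0 = multi 1 (fromIntegral b .&. 0x1F) bs
  | b >= 0xE0 && b < 0xF0 = multi 2 (fromIntegral b .&. 0x0F) bs
  | b >= 0xF0 && b < 0xF8 = multi 3 (fromIntegral b .&. 0x07) bs
  | otherwise             = 0xFFFD : decodeUtf8 bs
  where
    -- Consume n continuation octets, accumulating 6 bits from each.
    multi :: Int -> Int -> [Word8] -> [Int]
    multi 0 acc rest = acc : decodeUtf8 rest
    multi n acc (c:rest)
      | c .&. 0xC0 == 0x80 =
          multi (n - 1) ((acc `shiftL` 6) .|. fromIntegral (c .&. 0x3F)) rest
    -- Missing or invalid continuation: emit U+FFFD and resynchronise.
    multi _ _ rest = 0xFFFD : decodeUtf8 rest

-- Encode code points as ISO-8859-1, substituting '?' (0x3F) for
-- anything above U+00FF.
toLatin1 :: [Int] -> [Word8]
toLatin1 = map conv
  where
    conv cp
      | cp < 0x100 = fromIntegral cp
      | otherwise  = 0x3F

utf8ToLatin1 :: [Word8] -> [Word8]
utf8ToLatin1 = toLatin1 . decodeUtf8
```

Both stages are lazy list functions, so they could sit directly
between a lazy octet reader and an octet writer.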

> I'd like to suggest the following: 

> * System.IO: addition of new Word8-based functions
> 
>   hGetOctet :: Handle -> IO Word8
>   hLookAheadOctet :: Handle -> IO Word8
>   hPutOctet :: Handle -> Word8 -> IO ()
>   hPutArray :: Handle -> [Word8] -> IO ()
>   hLazyGetArray :: Handle -> IO [Word8]
> 
> ...as per hGetChar, hLookAhead, hPutChar, hPutStr and hGetContents.

I've no criticism of the actual functions; this seems pretty
straightforward.  I don't particularly like 'octet' (probably because
I find it a bit pedantic, and it reminds me too much of committee
decisions), and would prefer hGetWord8 or even hGetByte/hGetWord.

Also, I find 'Array' a bit uncomfortable -- it's a list, isn't it?
But why not simply:

        hPutOctet (or Word, Word8, Byte) :: Handle -> Word8 -> IO ()
        hPutOctets (hPutWords, Bytes) :: Handle -> [Word8] -> IO ()
and
        hGetOctet :: Handle -> IO Word8
        hGetOctets :: Handle -> IO [Word8] -- hGetContents-alike

(I'd rather annotate the strict versions; lazy is usually the default.)
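
The four signatures above can be sketched on top of the existing Char
functions.  This is only an illustration, assuming a GHC-style
System.IO where a handle in binary mode (openBinaryFile, or
hSetBinaryMode h True) maps each Char to exactly one byte; a native
Word8 implementation would presumably look different.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)
import System.IO

-- All of these assume the handle is already in binary mode, so that
-- every Char read or written is in the range 0..255.

hGetOctet :: Handle -> IO Word8
hGetOctet h = fromIntegral . ord <$> hGetChar h

hPutOctet :: Handle -> Word8 -> IO ()
hPutOctet h = hPutChar h . chr . fromIntegral

hPutOctets :: Handle -> [Word8] -> IO ()
hPutOctets h = hPutStr h . map (chr . fromIntegral)

-- Lazy, like hGetContents: the handle is semi-closed after the call
-- and the octets are read on demand.
hGetOctets :: Handle -> IO [Word8]
hGetOctets h = map (fromIntegral . ord) <$> hGetContents h
```

Going through Char is the reverse of the layering suggested in this
thread, but it shows that the proposed types are easy to provide.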

I'd also like a 'readFile'-like function for octets; I find my
programs do a lot of their IO by lazily reading files, and being able
to bury all the handle stuff is a good thing, IMHO.
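
A 'readFile'-like function for octets might look like the sketch
below.  The names readFileOctets and writeFileOctets are hypothetical,
and the code assumes that a handle opened with openBinaryFile yields
one Char per byte.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)
import System.IO

-- Lazily read a whole file as octets.  As with readFile, the handle
-- is hidden and is closed once the list has been fully consumed.
readFileOctets :: FilePath -> IO [Word8]
readFileOctets path = do
  h <- openBinaryFile path ReadMode
  map (fromIntegral . ord) <$> hGetContents h

-- The matching writer; the handle is closed when the action returns.
writeFileOctets :: FilePath -> [Word8] -> IO ()
writeFileOctets path ws =
  withBinaryFile path WriteMode $ \h ->
    hPutStr h (map (chr . fromIntegral) ws)
```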

>   hGetArrayBlock :: Handle -> Int -> IO [Word8]
>   hGetArrayReady :: Handle -> Int -> IO [Word8]

(Do these represent anything from Char IO?  It seems the latter is
intended for things like sockets, but (and here I display my ignorance
even more than usual) can a Handle really represent a socket?)
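
One possible reading of the two quoted signatures, sketched under the
assumption that "Block" means "wait until the requested count or end
of file" and "Ready" means "return only what is available without
waiting".  It uses the existing hGetBuf and hGetBufNonBlocking from
System.IO; the helper getWith is hypothetical, and the original
proposal may have intended different semantics.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (Ptr)
import System.IO

-- Run a buffer-filling action on a temporary buffer of n bytes and
-- copy out however many octets it reports having read.
getWith :: (Handle -> Ptr Word8 -> Int -> IO Int)
        -> Handle -> Int -> IO [Word8]
getWith fill h n =
  allocaBytes n $ \buf -> do
    got <- fill h buf n
    peekArray got buf

-- Blocks until n octets have been read or end of file is reached.
hGetArrayBlock :: Handle -> Int -> IO [Word8]
hGetArrayBlock = getWith hGetBuf

-- Returns at most n octets, without waiting for more to arrive.
hGetArrayReady :: Handle -> Int -> IO [Word8]
hGetArrayReady = getWith hGetBufNonBlocking
```

Unlike the lazy readers, both of these are strict: the result list is
fully built before the call returns.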

> * Network.Socket

> The send & receive functions don't do any kind of character 
> interpretation do they? I suggest these functions be given new types (and 
> probably new names, as I suppose we'll need the old ones for 
> compatibility):

I like the names, as they are taken more or less directly from the C
libraries.  I agree they shouldn't do any translation, though, so if
the names have to go, so be it.

-kzm
-- 
If I haven't seen further, it is by standing in the footprints of giants