You definitely need to get to the octets when implementing communications protocols. Does this mean that if I send or receive certain octets or sequences of octets these will be converted into something else?


> 2. A file is not made of "Char"s. A file is made of octets ("bytes"),
> i.e. Word8s. What is a "Char" anyway? Sometimes it's a seven- or
> eight-bit quantity with a _vague_ implication of interpretation as
> textual character; sometimes it's a 16-, 20.087- or 31-bit quantity with
> a much stronger implication of interpretation as textual character
> (strictly, Unicode "codepoint"). Is an ASCII 'r' the same as an EBCDIC
> 'r'? Or is an ASCII code 57 the same as an EBCDIC code 57?
> As for streams, mostly they are streams of octets. But of course streams
> of anything might be useful.

There's an implicit conversion step between whatever the on-disk
encoding of character streams is and Unicode.  GHC currently only supports
a straightforward ISO 8859-1 encoding.
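
To make the conversion concrete: in ISO 8859-1 (Latin-1) each octet corresponds
to the Unicode code point with the same number, so the mapping can be sketched
with nothing but `base`. The names `decodeLatin1` and `encodeLatin1` are made
up for this illustration; this is not a description of GHC's internals.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper: decode ISO 8859-1 octets to Chars.
-- Octet n becomes the code point with the same number n.
decodeLatin1 :: [Word8] -> String
decodeLatin1 = map (chr . fromIntegral)

-- Hypothetical helper: encode Chars back to octets.
-- Code points above 255 have no ISO 8859-1 representation.
encodeLatin1 :: String -> Maybe [Word8]
encodeLatin1 = mapM enc
  where
    enc c
      | ord c <= 0xFF = Just (fromIntegral (ord c))
      | otherwise     = Nothing

main :: IO ()
main = do
  let octets = [0x72, 0xE9, 0x00, 0xFF] :: [Word8]
  -- Decoding and re-encoding gives back the same octets.
  print (encodeLatin1 (decodeLatin1 octets) == Just octets)
  -- The same holds for every possible octet value.
  print (encodeLatin1 (decodeLatin1 [minBound .. maxBound])
           == Just [minBound .. maxBound])
```

Under that assumption the character-level mapping itself loses nothing for
octets 0 to 255. Whether the I/O layer does anything else on the way (newline
translation, say) is a separate matter that this sketch does not cover.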

I agree there ought to be a way to get at the raw bytes too.
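
For illustration only, here is roughly what raw access could look like. This
sketch assumes the `bytestring` package (`Data.ByteString`), which is not
something this message refers to; `readOctets` and the scratch file name
`octets.bin` are made up for the example.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

-- Hypothetical helper: read a file as raw octets, with no decoding step.
readOctets :: FilePath -> IO [Word8]
readOctets path = B.unpack <$> B.readFile path

main :: IO ()
main = do
  let path   = "octets.bin"  -- hypothetical scratch file
      octets = [0x0D, 0x0A, 0x00, 0xFF, 0x72] :: [Word8]
  -- Write the octets out unchanged, then read them back unchanged.
  B.writeFile path (B.pack octets)
  back <- readOctets path
  print (back == octets)
```

The point of the design is that the program only ever sees `Word8` values, so
any interpretation as characters becomes an explicit, separate step.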

