[Haskell-cafe] Re: Writing binary files?

Simon Marlow simonmar at microsoft.com
Thu Sep 16 06:19:26 EDT 2004


On 16 September 2004 10:35, Glynn Clements wrote:

> Simon Marlow wrote:
> 
>>> Which is why I'm suggesting changing Char to be a byte, so that we
>>> can have the basic, robust API now and wait for the more advanced
>>> API, rather than having to wait for a usable API while people sort
>>> out all of the issues.
>> 
>> An easier way is just to declare that the existing API assumes a
>> Latin-1 encoding consistently.  Later we might add a way to let the
>> application pick another encoding, or request that the I/O library
>> uses the locale encoding.
> 
> But how do you do that without breaking stuff? If the application
> changes the encoding to UTF-8 (either explicitly, or by using the
> locale's encoding when it happens to be UTF-8), then code such as:
> 
> 	[filename] <- getArgs
> 	openFile filename ReadMode
> 
> will fail if filename isn't a valid UTF-8 sequence. Similarly for the
> other cases where the OS accepts/returns byte strings but the Haskell
> interface uses String.

And that's the correct behaviour, isn't it?

Actually, I hadn't really considered filenames; I was just talking about
data read and written via the IO library.
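
To make the Latin-1 suggestion concrete, here is a minimal sketch of what
"the existing API assumes Latin-1" means for stream data.  The names
decodeLatin1 and encodeLatin1 are hypothetical, chosen for illustration
only; they are not part of the IO library.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper: under Latin-1 every byte maps to the Char with
-- the same code point, so decoding can never fail.
decodeLatin1 :: [Word8] -> String
decodeLatin1 = map (chr . fromIntegral)

-- Hypothetical helper: encoding is partial, because a Char above
-- U+00FF has no Latin-1 representation.
encodeLatin1 :: String -> Maybe [Word8]
encodeLatin1 = mapM enc
  where
    enc c
      | ord c <= 0xFF = Just (fromIntegral (ord c))
      | otherwise     = Nothing
```

Decoding is total and round-trips every byte sequence unchanged, which is
why this choice cannot break existing binary I/O; only writing a Char
outside the Latin-1 range needs a policy decision.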

> I'm less concerned about the handling of streams, as you can
> reasonably add a way to change the encoding before any data has been
> read or written. I'm more concerned about FilePaths, argv, the
> environment etc.

Yes, these are interesting issues.  Filenames are stored as character
strings on some OSes (e.g. Windows) and as byte strings on others.  So
the portable Haskell API should probably use String, and do decoding
based on the locale (if the programmer asks for it).
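
A rough sketch of what opt-in decoding of a byte-string filename could
look like, and of the failure case Glynn describes.  Everything here is
an assumption for illustration: NameEncoding, decodeName and decodeUtf8
are hypothetical names, not an existing or proposed library interface.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical policy: how the raw bytes of a file name become a String.
data NameEncoding = Latin1Name | Utf8Name

-- Latin-1 decoding always succeeds; strict UTF-8 decoding may not.
decodeName :: NameEncoding -> [Word8] -> Maybe String
decodeName Latin1Name = Just . map (chr . fromIntegral)
decodeName Utf8Name   = decodeUtf8

-- Strict UTF-8 decoder: Nothing on any malformed sequence.
decodeUtf8 :: [Word8] -> Maybe String
decodeUtf8 [] = Just []
decodeUtf8 (b:bs)
  | b < 0x80           = (chr (fromIntegral b) :) <$> decodeUtf8 bs
  | b .&. 0xE0 == 0xC0 = multi 1 (fromIntegral b .&. 0x1F) 0x80 bs
  | b .&. 0xF0 == 0xE0 = multi 2 (fromIntegral b .&. 0x0F) 0x800 bs
  | b .&. 0xF8 == 0xF0 = multi 3 (fromIntegral b .&. 0x07) 0x10000 bs
  | otherwise          = Nothing

-- Consume n continuation bytes; reject truncated, overlong, surrogate
-- and out-of-range sequences.
multi :: Int -> Int -> Int -> [Word8] -> Maybe String
multi n acc minCode bs
  | length conts == n && all isCont conts && valid
      = (chr code :) <$> decodeUtf8 rest
  | otherwise = Nothing
  where
    (conts, rest) = splitAt n bs
    isCont c = c .&. 0xC0 == 0x80
    code  = foldl (\a c -> (a `shiftL` 6) .|. fromIntegral (c .&. 0x3F)) acc conts
    valid = code >= minCode && code <= 0x10FFFF
            && not (code >= 0xD800 && code <= 0xDFFF)
```

With this sketch, the bytes [0x66, 0xE9] (a Latin-1 "fé") decode fine
under Latin1Name but give Nothing under Utf8Name, so a program that
decoded its arguments as UTF-8 would be unable to name that file at all.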

Argv and the environment - I don't know.  Windows CreateProcess() allows
these to be passed as UTF-16 strings, but I don't know what
encoding/decoding happens between CreateProcess() and what the target
process sees in its argv[] (can't be bothered to dig through MSDN right
now).  I suspect these should be Strings in Haskell too, with
appropriate decoding/encoding happening under the hood.

Cheers,
	Simon

