[Haskell-cafe] Re: Writing binary files?

Glynn Clements glynn.clements at virgin.net
Thu Sep 16 05:35:05 EDT 2004


Simon Marlow wrote:

> > Which is why I'm suggesting changing Char to be a byte, so that we can
> > have the basic, robust API now and wait for the more advanced API,
> > rather than having to wait for a usable API while people sort out all
> > of the issues.
> 
> An easier way is just to declare that the existing API assumes a Latin-1
> encoding consistently.  Later we might add a way to let the application
> pick another encoding, or request that the I/O library uses the locale
> encoding.  

But how do you do that without breaking stuff? If the application
changes the encoding to UTF-8 (either explicitly, or by using the
locale's encoding when it happens to be UTF-8), then code such as:

	[filename] <- getArgs
	openFile filename ReadMode

will fail if filename isn't a valid UTF-8 sequence. Similarly for the
other cases where the OS accepts/returns byte strings but the Haskell
interface uses String.

Currently, the use of String for byte strings doesn't cause problems
because decoding using ISO-8859-1 can't fail: every byte maps to
exactly one character. Allowing the use of a fallible decoder (such
as UTF-8) introduces a new set of issues.
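
To make the difference concrete, here is a minimal sketch of the two
kinds of decoder. It is not the I/O library's actual code; the names
decodeLatin1, encodeLatin1 and decodeUtf8 are made up for this
illustration, and byte strings are modelled as plain [Word8].

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- ISO-8859-1: every byte is exactly one Char, so decoding is total.
decodeLatin1 :: [Word8] -> String
decodeLatin1 = map (chr . fromIntegral)

-- Inverse of decodeLatin1. Only faithful for Chars up to '\xFF';
-- larger code points would be truncated (hypothetical helper).
encodeLatin1 :: String -> [Word8]
encodeLatin1 = map (fromIntegral . ord)

-- UTF-8: decoding is partial. Nothing means "not a valid sequence".
decodeUtf8 :: [Word8] -> Maybe String
decodeUtf8 [] = Just []
decodeUtf8 (b:bs)
  | b < 0x80               = (chr (fromIntegral b) :) <$> decodeUtf8 bs
  | b >= 0xC2 && b <= 0xDF = multi 1 (fromIntegral b .&. 0x1F) 0x80
  | b >= 0xE0 && b <= 0xEF = multi 2 (fromIntegral b .&. 0x0F) 0x800
  | b >= 0xF0 && b <= 0xF4 = multi 3 (fromIntegral b .&. 0x07) 0x10000
  | otherwise              = Nothing  -- stray continuation or invalid lead byte
  where
    -- Consume n continuation bytes, rejecting truncated sequences,
    -- overlong forms, surrogates and values above U+10FFFF.
    multi n acc0 minCp =
      let (conts, rest) = splitAt n bs
          isCont c = c .&. 0xC0 == 0x80
          cp = foldl (\a c -> (a `shiftL` 6) .|. fromIntegral (c .&. 0x3F))
                     acc0 conts
      in if length conts == n && all isCont conts
            && cp >= minCp && cp <= 0x10FFFF
            && not (cp >= 0xD800 && cp <= 0xDFFF)
           then (chr cp :) <$> decodeUtf8 rest
           else Nothing
```

With these definitions, the two-byte filename [0x66, 0xFC] ("f" followed
by a Latin-1 u-umlaut) decodes under decodeLatin1 and round-trips
through encodeLatin1 unchanged, whereas decodeUtf8 returns Nothing for
it. The caller then has to decide what to do with a name the OS was
perfectly happy to hand over.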

E.g. what happens if you call getDirectoryContents on a directory
containing filenames that aren't valid in the current encoding?
Does the call fail outright, or are invalid entries silently omitted?
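
The two options can be sketched as follows. This is only an
illustration: RawName, decodeName, listingOrFail and listingDropInvalid
are hypothetical names, and decodeName is a deliberately trivial
stand-in (it accepts 7-bit ASCII only) for whatever fallible decoder
the library would actually use.

```haskell
import Data.Char (chr)
import Data.Maybe (mapMaybe)
import Data.Word (Word8)

-- A filename as the OS returns it: an uninterpreted byte string.
type RawName = [Word8]

-- Stand-in fallible decoder (assumption): accepts only 7-bit ASCII.
decodeName :: RawName -> Maybe String
decodeName bs
  | all (< 0x80) bs = Just (map (chr . fromIntegral) bs)
  | otherwise       = Nothing

-- Option 1: a single undecodable entry makes the whole listing fail.
listingOrFail :: [RawName] -> Maybe [String]
listingOrFail = traverse decodeName

-- Option 2: undecodable entries are silently dropped.
listingDropInvalid :: [RawName] -> [String]
listingDropInvalid = mapMaybe decodeName
```

For a directory holding the entries [0x61] and [0x62, 0xFF],
listingOrFail yields Nothing, so one bad name hides every other file,
while listingDropInvalid yields ["a"], so the second file exists on
disk but can never be seen or opened through this interface. Neither
outcome is attractive.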

I'm less concerned about the handling of streams, as you can
reasonably add a way to change the encoding before any data has been
read or written. I'm more concerned about FilePaths, argv, the
environment etc.

-- 
Glynn Clements <glynn.clements at virgin.net>

