[Haskell-cafe] Re: getting crazy with character encoding

Stefan O'Rear stefanor at cox.net
Wed Sep 12 20:40:17 EDT 2007


On Thu, Sep 13, 2007 at 12:23:33AM +0000, Aaron Denney wrote:
> Unfortunately, at this point it is a well entrenched bug, and changing
> the behaviour will undoubtedly break programs.
...
> There should be another system for getting the exact bytes in and out
> (as Word8s, say, rather than Chars), and there are in fact external
> libraries using lower level interfaces, rather than the things like
> putStr, getLine, etc. that do this.  An external library works, of
> course, but it should be part of the standard so implementors know that
> character based routines actually are character based, not byte based.
...
> I don't know what NHC and hugs do, though I assume they also provide
> no translations.  I'm also not sure what JHC does, though I do see
> mentions of UTF-8, UTF-16 (for windows), and UTF-32 (for internal usage
> of C libraries), and I do know that John is fairly careful about locale
> issues.

I'm pretty sure Hugs does the right thing.  NHC is probably broken.  In
any case, we already have hGetBuf / hPutBuf in the standard base
libraries for raw binary IO, so code that uses getChar for bytes really
has no excuse.  We can and should fix the bug.
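
For illustration, a minimal sketch of byte-exact IO with hGetBuf / hPutBuf
might look like the following. The helper name copyBytes and the 4096-byte
buffer size are assumptions made for this example, not anything from the
thread; the call to hSetBinaryMode is there so that no newline or character
translation is applied to the handles.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import System.IO

-- Hypothetical helper: copy raw bytes from one handle to another.
-- The data is never interpreted as Chars, so no decoding can occur.
copyBytes :: Handle -> Handle -> IO ()
copyBytes hIn hOut = allocaBytes bufSize loop
  where
    -- Buffer size is an arbitrary choice for this sketch.
    bufSize :: Int
    bufSize = 4096

    loop :: Ptr Word8 -> IO ()
    loop buf = do
      -- hGetBuf returns the number of bytes read; 0 means end of file.
      n <- hGetBuf hIn buf bufSize
      if n == 0
        then return ()
        else hPutBuf hOut buf n >> loop buf

main :: IO ()
main = do
  -- Turn off any text-mode translation on the standard handles.
  hSetBinaryMode stdin True
  hSetBinaryMode stdout True
  copyBytes stdin stdout
```

Compiled as a program, this should behave like a byte-for-byte cat from
stdin to stdout, whatever the locale.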

Stefan