[Haskell-cafe] Re: getting crazy with character encoding

David Roundy droundy at darcs.net
Thu Sep 13 10:39:28 EDT 2007


On Thu, Sep 13, 2007 at 06:49:59AM -0700, Stefan O'Rear wrote:
> On Thu, Sep 13, 2007 at 12:06:15PM +0200, Ketil Malde wrote:
> > On Wed, 2007-09-12 at 17:40 -0700, Stefan O'Rear wrote:
> > > On Thu, Sep 13, 2007 at 12:23:33AM +0000, Aaron Denney wrote:
> > > > Unfortunately, at this point it is a well entrenched bug, and changing
> > > > the behaviour will undoubtedly break programs.
> > > ...
> > > > There should be another system for getting the exact bytes in and 
> > > > out (as Word8s, say, rather than Chars), 
> > 
> > > I'm pretty sure Hugs does the right thing.
> > 
> > ..which makes me wonder what the right thing actually is?
> > 
> > Since IO on Unix (or at least on Linux) consists of bytes, I don't see
> > how a Unicode-only interface is ever going to do the 'right thing' for
> > all people.
> 
> I never said it was Unicode-only.
> 
> hGetBuf / hPutBuf - Raw Word8 access
> getChar etc       - Uses locale info

The problem is that the types of openFile and getArgs are wrong: both deal
in [Char], while file names and command-line arguments on Unix are bytes.
So there's no "right" way to get a Handle (other than stdin) to read from
in the first place, unless we're willing to accept the current weird
behavior of treating a [Char] as [Word8].
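
To make the hGetBuf / hPutBuf side concrete, here is a minimal sketch of raw
Word8 access once you do have a Handle. The helper names readBytes and
writeBytes are made up for illustration, and withBinaryFile is assumed to be
available in your System.IO; this is one way to do it, not the only one.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray, withArrayLen)
import System.IO (IOMode (..), hGetBuf, hPutBuf, withBinaryFile)

-- Hypothetical helper: read at most n raw bytes from a file.
-- No character decoding is applied; the bytes come back as Word8s.
readBytes :: FilePath -> Int -> IO [Word8]
readBytes path n =
  withBinaryFile path ReadMode $ \h ->
    allocaBytes n $ \buf -> do
      got <- hGetBuf h buf n   -- number of bytes actually read
      peekArray got buf

-- Hypothetical helper: write the given bytes to a file unchanged.
writeBytes :: FilePath -> [Word8] -> IO ()
writeBytes path bytes =
  withBinaryFile path WriteMode $ \h ->
    withArrayLen bytes $ \len buf ->
      hPutBuf h buf len
```

Note that the path argument is still a FilePath, i.e. a [Char], so this only
covers the contents of the file. It does nothing about how the name itself
gets turned into bytes, which is the part I'm complaining about.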
-- 
David Roundy
Department of Physics
Oregon State University