[Haskell-cafe] A 3 line program --> Reid, Don, Daniel

Richard O'Keefe ok at cs.otago.ac.nz
Mon Oct 26 20:12:53 EDT 2009


On Oct 25, 2009, at 5:01 PM, Curt Sampson wrote:
> Actually, you would be having the exact same issues with Java; in UTF-8
> mode it would also choke on Latin-1.

Yes, but from the 'javac' man-page:

       -encoding encoding
               Sets the source file encoding name, such as
               EUCJIS/SJIS/ISO8859-1/UTF8.  If -encoding is not specified,
               the platform default converter is used.

The corresponding part of the GHC documentation says

	GHC assumes that source files are ASCII or UTF-8 only,
	other encodings are not recognised.  However,
	invalid UTF-8 sequences will be ignored in comments,
	so it is possible to use other encodings such as Latin-1,
	as long as the non-comment source code is ASCII only.

There's no obvious reason why GHC couldn't support any source
encoding that the host's iconv() supports.
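
To make concrete how little is involved, here is a rough sketch of the
transcoding step, run as a separate pass over the file before GHC sees it.
It is only an illustration, not anything GHC does itself: the function name
transcodeLatin1ToUtf8 is made up for this example, and it assumes a base
library recent enough to provide System.IO.hSetEncoding together with the
latin1 and utf8 encodings.

```haskell
import System.IO

-- Hypothetical helper (not part of GHC): read a source file as
-- ISO-8859-1 and write a UTF-8 copy that GHC can then compile.
-- The source and destination paths must be different files.
transcodeLatin1ToUtf8 :: FilePath -> FilePath -> IO ()
transcodeLatin1ToUtf8 src dst =
  withFile src ReadMode $ \hin ->
    withFile dst WriteMode $ \hout -> do
      hSetEncoding hin latin1   -- decode input bytes as Latin-1
      hSetEncoding hout utf8    -- encode output characters as UTF-8
      -- hPutStr consumes the whole lazy input before either handle closes
      hGetContents hin >>= hPutStr hout
```

One would then compile the UTF-8 copy rather than the original.  A built-in
switch would still be better, since this leaves every user to maintain the
extra step by hand.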

>  Blaming Haskell for this
> "problem" is quite unfair.

It is perfectly fair.  The problem is not that the original user
isn't telling GHC what the encoding is, but that GHC cannot be
told.  A javac-like -encoding switch on the command line would
meet the original need.
>
> (If all of this UTF-8 stuff seems annoying to you, consider that in
> ISO-8859-1 it's not possible to express the simplest Japanese word.

And why, exactly, should someone who has no Japanese words to express
even care?  You have explained why UTF-8 is a good *default*; that
does not make choosing it as the *only* option a good idea.



