UTF-8 BOM, really!? (was: [Haskell-cafe] Re: File path programme)

Bayley, Alistair Alistair_Bayley at ldn.invesco.com
Mon Jan 31 06:04:15 EST 2005


> From: Aaron Denney [mailto:wnoise at ofb.net] 
> 
> Better yet would be to have the standard never allow the BOM.
> 
> Since some things can't handle it, on output we should never emit it,
> but still must handle it on input.  Bah.


I don't see how banning it from input would help; as I understand it, it's
meant to be ignored anyway, so there's no loss or gain there.

Whether or not to emit the BOM on output should be a user choice, surely?
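
To make the two halves concrete, here is a minimal sketch using strict
ByteStrings: tolerate the BOM on input, and leave emitting it on output to the
caller. The names utf8Bom, stripBom, BomPolicy and withBomPolicy are made up
for this example and are not from any existing library; the only library calls
used are B.pack, B.isPrefixOf, B.drop, B.length and B.append from the
bytestring package.

```haskell
import qualified Data.ByteString as B

-- The UTF-8 encoding of U+FEFF is the three bytes EF BB BF.
utf8Bom :: B.ByteString
utf8Bom = B.pack [0xEF, 0xBB, 0xBF]

-- Input side: drop one leading BOM if present, otherwise
-- return the bytes unchanged.
stripBom :: B.ByteString -> B.ByteString
stripBom bs
  | utf8Bom `B.isPrefixOf` bs = B.drop (B.length utf8Bom) bs
  | otherwise                 = bs

-- Output side: whether to emit the BOM is the caller's choice.
-- (Hypothetical type, for illustration only.)
data BomPolicy = EmitBom | OmitBom

-- Assumes the payload does not already start with a BOM.
withBomPolicy :: BomPolicy -> B.ByteString -> B.ByteString
withBomPolicy EmitBom bs = B.append utf8Bom bs
withBomPolicy OmitBom bs = bs
```

With this shape, stripBom . withBomPolicy EmitBom gives back the original
payload, so a reader that tolerates the BOM loses nothing whichever policy the
writer picks.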


> From: Graham Klyne [mailto:GK at ninebynine.org] 
> 
> How can it make sense to have a BOM in UTF-8?  UTF-8 is a sequence of 
> octets (bytes);  what ordering is there here that can 
> sensibly be varied?


From http://www.unicode.org/faq/utf_bom.html#BOM :

"Q: Where is a BOM useful?

A: A BOM is useful at the beginning of files that are typed as text, but for
which it is not known whether they are in big or little endian format..."

i.e. UTF-8 has no byte order to mark, but the BOM still acts as an encoding
signature: it helps when you need to guess what the encoding of a given file
might be.
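
As a rough illustration of that guessing, the sketch below looks at the first
few bytes of a file and reports which Unicode encoding the BOM suggests, if
any. The Encoding type and the names boms and guessEncoding are hypothetical,
chosen for this example; the byte signatures are the standard ones for U+FEFF
in each encoding form. One caveat: FF FE 00 00 is treated as UTF-32LE here,
although it could in principle be UTF-16LE text that starts with a NUL
character.

```haskell
import qualified Data.ByteString as B
import Data.List (find)

-- Hypothetical type for this sketch.
data Encoding = UTF8 | UTF16BE | UTF16LE | UTF32BE | UTF32LE
  deriving (Eq, Show)

-- BOM signatures. The UTF-32 entries come first because the
-- UTF-32LE signature begins with the UTF-16LE one (FF FE).
boms :: [(Encoding, B.ByteString)]
boms =
  [ (UTF32BE, B.pack [0x00, 0x00, 0xFE, 0xFF])
  , (UTF32LE, B.pack [0xFF, 0xFE, 0x00, 0x00])
  , (UTF8,    B.pack [0xEF, 0xBB, 0xBF])
  , (UTF16BE, B.pack [0xFE, 0xFF])
  , (UTF16LE, B.pack [0xFF, 0xFE])
  ]

-- Nothing means no BOM was found; the caller then has to fall
-- back on some other guess or a default.
guessEncoding :: B.ByteString -> Maybe Encoding
guessEncoding bs = fmap fst (find matches boms)
  where
    matches (_, sig) = sig `B.isPrefixOf` bs
```

A file with no BOM yields Nothing, which is the common case for UTF-8 files
written on Unix, so this can only ever be one hint among several.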


Alistair.

