[Haskell-cafe] Encoding of Haskell source files

Tako Schotanus tako at codejive.org
Tue Apr 5 03:23:57 CEST 2011


On Mon, Apr 4, 2011 at 17:51, Yitzchak Gale <gale at sefer.org> wrote:

> malcolm.wallace wrote:
> >> BOM is not part of UTF8, because UTF8 is byte-oriented.  But
> applications
> >> should be prepared to read and discard it, because some applications
> >> erroneously generate it.
>
> For maximum portability, the standard should be require compilers
> to accept and discard an optional BOM as the first character of a
> source code file.
>
> Tako Schotanus wrote:
> > That's not what the official unicode site says in its FAQ:
> > http://unicode.org/faq/utf_bom.html#bom4 and
> http://unicode.org/faq/utf_bom.html#bom5
>
> That FAQ clearly states that BOM is part of some "protocols".
> It carefully avoids stating whether it is part of the encoding.
>
> It is certainly not erroneous to include the BOM
> if it is part of the protocol for the applications being used.
> Applications can include whatever characters they'd like, and
> they can use whatever handshake mechanism they'd like to
> agree upon an encoding. The BOM mechanism is common
> on the Windows platform. It has since appeared in other
> places as well, but it is certainly not universally adopted.
>
> Python supports a pseudo-encoding called "utf8-bom" that
> automatically generates and discards the BOM in support
> of that handshake mechanism But it isn't really an encoding,
> it's a convenience.
>
> Part of the source of all this confusion is some documentation
> that appeared in the past on Microsoft's site which was unclear
> about the fact that the BOM handshake is a protocol adopted
> by Microsoft, not a part of the encoding itself. Some people
> claim that this was intentional, part of the "extend and embrace"
> tactic Microsoft allegedly employed in those days in an effort
> to expand its monopoly.
>
> The wording of the Unicode FAQ is obviously trying to tip-toe
> diplomatically around this issue without arousing the ire of
> either pro-Microsoft or anti-Microsoft developers.
>
>
Some reliable sources for all this would be entertaining (although
irrelevant for the rest of this discussion).

Cheers,
 -Tako
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20110405/c6b7a88d/attachment.htm>


More information about the Haskell-Cafe mailing list