[Haskell'-private] pragmas and annotations (RE: the record system)
Ben Rudiak-Gould
Benjamin.Rudiak-Gould at cl.cam.ac.uk
Tue Feb 28 16:07:37 EST 2006
Malcolm Wallace wrote:
> * If the first three bytes of the file are "{-#", then keep reading in
> ASCII/Latin-1/whatever until you discover an ENCODING decl (or not).
>
> * If the first six bytes of the file are one of the two possible
> UTF-16 representations of "{-#", then assume UTF-16 with that
> byte-encoding until we find the ENCODING decl. (A missing decl in
> this case would be an error.)
>
> * If the first twelve bytes of the file are a UCS-4 representation of
> "{-#" then ... you get the picture.
>
> * For UTF-16 and UCS-4 variations, you must also permit the file to
> begin with an optional byte-order mark (two or four bytes).
You'd also want to look for the UTF-8 BOM (the three bytes EF BB BF), which
is very common on Windows.
As for literate source, I suppose you could forbid .lhs files from using
UTF-16 or UCS-4 unless there's a BOM. Then unlit wouldn't need to know the
encoding (I think), and the .hs heuristics would work on the output.
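
A rough Haskell sketch of the sniffing order discussed above, assuming the
leading bytes of the file are available as a strict ByteString. The names
Guess, sniff, boms, openers and widen are made up for illustration and do not
come from any existing compiler. The sketch only classifies the prefix; it
does not go on to read the ENCODING decl, and it does not try to handle a BOM
that is followed by something other than "{-#".

```haskell
import Control.Applicative ((<|>))
import qualified Data.ByteString as B
import Data.Maybe (fromMaybe, listToMaybe)
import Data.Word (Word8)

-- | What the first few bytes suggest (hypothetical names).
data Guess
  = AsciiCompatible  -- plain "{-#": keep reading bytes, look for ENCODING
  | Utf8WithBom      -- EF BB BF seen
  | Utf16BE | Utf16LE
  | Ucs4BE | Ucs4LE
  | NoEvidence       -- neither a BOM nor "{-#" in any recognised form
  deriving (Eq, Show)

-- | The pragma opener "{-#" as ASCII bytes.
pragma :: [Word8]
pragma = [0x7B, 0x2D, 0x23]

-- | Re-encode ASCII bytes as fixed-width code units (2 or 4 bytes each).
widen :: Int -> Bool -> [Word8] -> B.ByteString
widen width bigEndian = B.pack . concatMap unit
  where
    pad = replicate (width - 1) 0
    unit b
      | bigEndian = pad ++ [b]
      | otherwise = b : pad

-- | Byte-order marks. The UCS-4 entries come first because the
-- little-endian UCS-4 BOM begins with the little-endian UTF-16 BOM.
boms :: [(B.ByteString, Guess)]
boms =
  [ (B.pack [0x00, 0x00, 0xFE, 0xFF], Ucs4BE)
  , (B.pack [0xFF, 0xFE, 0x00, 0x00], Ucs4LE)
  , (B.pack [0xFE, 0xFF], Utf16BE)
  , (B.pack [0xFF, 0xFE], Utf16LE)
  , (B.pack [0xEF, 0xBB, 0xBF], Utf8WithBom)
  ]

-- | "{-#" with no BOM: twelve, six, or three bytes.
openers :: [(B.ByteString, Guess)]
openers =
  [ (widen 4 True pragma, Ucs4BE)
  , (widen 4 False pragma, Ucs4LE)
  , (widen 2 True pragma, Utf16BE)
  , (widen 2 False pragma, Utf16LE)
  , (B.pack pragma, AsciiCompatible)
  ]

-- | Try the BOM table first, then the bare pragma openers.
sniff :: B.ByteString -> Guess
sniff bytes = fromMaybe NoEvidence (match boms <|> match openers)
  where
    match table =
      listToMaybe [g | (prefix, g) <- table, prefix `B.isPrefixOf` bytes]

main :: IO ()
main = mapM_ (print . sniff)
  [ B.pack pragma                            -- bare ASCII "{-#"
  , B.pack ([0xEF, 0xBB, 0xBF] ++ pragma)    -- UTF-8 BOM, then "{-#"
  , widen 2 False pragma                     -- UTF-16LE "{-#", no BOM
  ]
```

Checking BOMs before the bare openers is a choice made for the sketch: once a
BOM is present the width and byte order are already known, so the "{-#" test
is only needed for files that lack one.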
-- Ben