UTF-8 decoding error
Jan-Willem Maessen
jmaessen at alum.mit.edu
Wed Sep 27 15:27:03 EDT 2006
On Sep 27, 2006, at 6:05 AM, Matthew Pocock wrote:
> Fortress (sun's possibly-not-vaporware hpc language) supports arbitrary
> unicode chars in code, and has an escape syntax for commonly used things.
I have spent the past week writing Fortress code (which runs in
parallel, even). But I'm perhaps a special case. :-)
> Similarly, proof-general/isabelle supports tex-style escapes for symbols &
> greek. It seems to me that a pre-processor that turns human-friendly escapes
> (e.g. \{lambda} rather than some magic number) into unicode and a slightly
> intelligent IDE (or emacs mode?) would go most of the way to letting us use
> up-side-down ys and curly as with all the visual beauty and editor niceness
> that we have now with ascii.
In Fortress we spent a *lot* of effort making the "TWiki" syntax as
painless as possible for stuff which we planned to use often (for
example, -> and => turn into Unicode arrows, and the language syntax
is defined in terms of them). One source of both encouragement and
frustration is the fact that every Unicode code point has an
associated description. We support using these descriptions---and
various shortenings of them, since they are too verbose for
day-to-day use. The frustration is that the names or their shortenings are
not necessarily unique. For characters which only occur in strings
this is less critical, but a little effort will go a long way.
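
To make the uniqueness problem concrete, here is a minimal sketch in Python rather than in Fortress. It is not the Fortress implementation: the helper names `candidates` and `resolve`, the word-subset matching rule, and the scan limit of U+3000 are all assumptions made for illustration. Only `unicodedata.name` and `unicodedata.lookup` are real library calls.

```python
import unicodedata

def candidates(short_name, limit=0x3000):
    """Return (char, official name) pairs whose name contains every word of short_name.

    Hypothetical matching rule: a shortening matches when all of its words
    appear in the official name. Only code points below `limit` are scanned.
    """
    wanted = short_name.upper().split()
    hits = []
    for cp in range(limit):
        name = unicodedata.name(chr(cp), "")  # "" for unnamed code points
        words = name.split()
        if name and all(word in words for word in wanted):
            hits.append((chr(cp), name))
    return hits

def resolve(short_name):
    """Map a name or shortening to one character, refusing ambiguous input."""
    try:
        # An exact official name always wins.
        return unicodedata.lookup(short_name.upper())
    except KeyError:
        pass
    hits = candidates(short_name)
    if len(hits) == 1:
        return hits[0][0]
    raise LookupError(
        f"{short_name!r} matches {len(hits)} characters; need a longer name"
    )
```

With this rule the full name `GREEK SMALL LETTER LAMDA` (the official Unicode spelling) resolves directly, while a shortening such as `LEFTWARDS` matches many arrows and is rejected. A real tool would need a policy for which shortenings are blessed, which is where the effort goes.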
One heuristic we've used is: "if I do a diff on the ASCII
representation provided by my version control system, will I be able
to read the result?"
We of course have a little program which processes an official
Unicode character table (downloaded from the web) plus some
information about our special cases and uses it to generate the
appropriate conversion functions. This is important because Unicode
is constantly changing (mostly getting bigger).
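
A rough sketch of what such a generator might look like, again in Python and again not the actual Fortress tool. The sample rows follow the semicolon-separated layout of the official table (code point in hex, then name) but are abridged to those two fields. The `\{NAME}` escape form, the choice of arrows for `->` and `=>`, and the names `parse_table` and `make_converters` are assumptions for illustration.

```python
# Abridged sample rows: hex code point, then official name.
SAMPLE_TABLE = """\
03BB;GREEK SMALL LETTER LAMDA
2192;RIGHTWARDS ARROW
21D2;RIGHTWARDS DOUBLE ARROW
"""

# Hand-maintained special cases (assumed mapping of the ASCII arrows).
SPECIAL_CASES = {"->": "\u2192", "=>": "\u21d2"}

def parse_table(text):
    """Build a name -> character map from the table text."""
    names = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        names[fields[1]] = chr(int(fields[0], 16))
    return names

def make_converters(table_text, special_cases):
    """Generate an ASCII->Unicode function and its inverse."""
    to_uni = dict(special_cases)  # special cases first so they win below
    for name, ch in parse_table(table_text).items():
        to_uni["\\{" + name + "}"] = ch  # hypothetical \{NAME} escape
    to_asc = {}
    for ascii_form, ch in to_uni.items():
        to_asc.setdefault(ch, ascii_form)  # keep the first (shortest) form

    def to_unicode(text):
        # Replace longer ASCII forms first so none is split by a shorter one.
        for ascii_form in sorted(to_uni, key=len, reverse=True):
            text = text.replace(ascii_form, to_uni[ascii_form])
        return text

    def to_ascii(text):
        return "".join(to_asc.get(ch, ch) for ch in text)

    return to_unicode, to_ascii
```

Regenerating the converters from a newer table is then just a matter of feeding in the new file, and `to_ascii` gives the readable form that the version-control diff heuristic asks for.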
-Jan-Willem Maessen
Fortress developer, Haskell hacker
>
> Matthew
>
> On Wednesday 20 September 2006 21:42, Duncan Coutts wrote:
>> On Wed, 2006-09-20 at 18:14 +0200, Christian Maeder wrote:
>>> How can I convince ghc version 6.5.20060919 to accept latin1 characters
>>> in literals?
>>>
>>> I wish to keep source files (containing umlauts in strings) that can be
>>> compiled by either ghc-6.4.2 or ghc-6.6.
>>
>> You can use numeric escapes like "\222".
>>
>> Duncan
>>
>> _______________________________________________
>> Glasgow-haskell-users mailing list
>> Glasgow-haskell-users at haskell.org
>> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users