[Haskell-i18n] unicode notation \uhhhh implementation

16 Aug 2002 11:58:55 +0200

On Fri, 2002-08-16 at 10:26, Ketil Z. Malde wrote:
> Ashley Yakeley <ashley@semantic.org> writes:
> 
> > Sure, but bear in mind Unicode names for characters are quite long, for 
> > instance
> 
> >      GREEK SMALL LETTER THETA
> 
> Hmm...yes.  My personal preference would be something close to
> (La)TeX.  Although it is perhaps a bit niche, it *is* a standard lhs
> style (and one which I quite like, too).

I think there would be no harm in having the TeX names for convenience.

> Would we need to maintain the list manually, then?  Perhaps we could
> standardise Unicode names, but additionally maintain short synonyms
> for Greek letters and similar mathematical symbols, which I suspect
> are rather commonly used?

Would allowing the full Unicode names really be an advantage? Something
like GREEK_SMALL_LETTER_THETA takes up almost half a line and might hurt
code readability more than \uhhhh does.

> > Right, but whatever it is it really should be an ASCII character: the 
> > point is to allow representation of all identifiers from 7-bit ASCII.
> 
> What's available, really?  "~!?$%.,^:;" are taken, along with quotes,
> numerical symbols and parens.  Are '#' and '&' still free?
> 
> Candidates I can think of might be:
> 
> 1        &alpha  -> similar to HTML entities
> 2        #alpha  -> possible problems with C preprocessor?

How about #uhhhh? No C preprocessor directive looks like that, so it
should be safe to run the Unicode preprocessor before cpp. The only
catch is that GHC uses # in identifiers and pragmas, as far as I can
see. Can someone comment on whether that would clash?
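
To make the idea concrete, here is a rough sketch of what such a pass
might look like. It is only an illustration: the name expandEscapes is
made up, and I am assuming that an escape is '#', 'u' and exactly four
hex digits, with everything else passed through untouched.

```haskell
import Data.Char (chr, isHexDigit)
import Numeric (readHex)

-- Hypothetical helper: replace each "#uhhhh" (exactly four hex digits,
-- an assumption of this sketch) with the character at that code point.
-- Any other text, including cpp directives, is copied unchanged.
expandEscapes :: String -> String
expandEscapes ('#':'u':a:b:c:d:rest)
  | all isHexDigit hex = chr (fst (head (readHex hex))) : expandEscapes rest
  where hex = [a, b, c, d]
-- If the guard above fails, matching falls through to this equation,
-- so a '#' that does not start an escape is kept as it is.
expandEscapes (c:cs) = c : expandEscapes cs
expandEscapes []     = []

-- Filter stdin to stdout; how non-ASCII characters are written out
-- depends on the implementation and its I/O encoding.
main :: IO ()
main = interact expandEscapes
```

With this, "theta = #u03b8" comes out as "theta = θ", while a line such
as "#undef FOO" is left alone because 'n' is not a hex digit. It does
nothing about GHC's own uses of #, which is the part I am unsure about.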

Sven Moritz