[Haskell-i18n] Unicode in source

Sven Moritz Hallberg pesco@gmx.de
22 Aug 2002 17:49:28 +0200


On Wed, 2002-08-21 at 23:55, Glynn Clements wrote:
> The other interpretation is that all glyphs have widths which are an
> integral number of "columns". Western (Latin, Cyrillic, Greek)
> characters are a single column wide, while CJK characters are
> typically two columns wide. The (Unix98) wcwidth() function can be
> used to obtain the width (in columns) of a given wide character
> (wchar_t) in the current locale.

I see, I wasn't aware of this; thanks for pointing it out. In that case,
we need some way of obtaining the width in columns of a Char in Haskell,
and the layout rule should be stated in terms of columns, correct?
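
To make the idea concrete, here is a rough sketch of what such a
function could look like. The names charWidth, wideRanges and columnOf
are made up for illustration, and the range table is a hand-picked,
incomplete approximation of the wide East Asian blocks, not the full
data that wcwidth() consults. It also ignores the locale entirely.

```haskell
-- Hypothetical sketch: approximate column width of a Char.
-- The table is deliberately small and incomplete (an assumption made
-- for illustration); a real implementation would be generated from
-- the Unicode character database.
wideRanges :: [(Int, Int)]
wideRanges =
  [ (0x1100, 0x115F)  -- Hangul Jamo initial consonants
  , (0x2E80, 0x303E)  -- CJK radicals, symbols and punctuation
  , (0x3041, 0x33FF)  -- Hiragana, Katakana, CJK compatibility
  , (0x4E00, 0x9FFF)  -- CJK Unified Ideographs
  , (0xAC00, 0xD7A3)  -- Hangul Syllables
  , (0xF900, 0xFAFF)  -- CJK Compatibility Ideographs
  , (0xFF01, 0xFF60)  -- Fullwidth ASCII variants
  ]

-- | Two columns for characters in the table above, one otherwise.
charWidth :: Char -> Int
charWidth c
  | any inRange wideRanges = 2
  | otherwise              = 1
  where
    n = fromEnum c
    inRange (lo, hi) = lo <= n && n <= hi

-- | Zero-based column at which the character with index i starts,
-- i.e. the summed widths of the i characters before it.
columnOf :: Int -> String -> Int
columnOf i = sum . map charWidth . take i
```

With something like this, the layout rule would compare columnOf
values rather than raw character offsets, so a token following two
CJK characters would sit in column 4, not column 2.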


> Character I/O functions should probably ignore composition, i.e.
> LATIN_SMALL_LETTER_A + COMBINING_ACUTE_ACCENT should appear as two
> separate characters to the application.
> 
> However, layout will only "work" if the compiler (or is it a
> preprocessor?) uses the same algorithm as the editor. If the editor
> shows a composition sequence as a single character cell, it needs to
> be treated as a single column for the purposes of layout.

Can the combining characters stand alone at all? If there's no (strong
enough) reason to believe they will ever be meant to occupy a column of
their own under the layout rule, all we have to decide is whether we
want to require compilers to recognize them.
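
If we did require it, one possible shape for the check is sketched
below. It assumes the compiler has access to Unicode general categories
(here via generalCategory from Data.Char) and treats non-spacing and
enclosing marks as occupying no column. The names isZeroWidthMark and
layoutColumns are invented for this example, and whether spacing
combining marks should count as a column is left open; this sketch
counts them.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- | Hypothetical predicate: marks that are assumed to be drawn in the
-- cell of the preceding base character and so add no column.
isZeroWidthMark :: Char -> Bool
isZeroWidthMark c = case generalCategory c of
  NonSpacingMark -> True
  EnclosingMark  -> True
  _              -> False

-- | Number of columns a string occupies if every character other than
-- a zero-width mark takes exactly one column (wide characters are
-- ignored here to keep the sketch short).
layoutColumns :: String -> Int
layoutColumns = length . filter (not . isZeroWidthMark)
```

Under this rule LATIN_SMALL_LETTER_A followed by COMBINING_ACUTE_ACCENT
counts as one column, the same as the precomposed character, which is
what an editor showing the sequence in a single cell would need.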

Ashley, do your property tools include something that can handle
composition?


Regards,
Sven Moritz