DBCS encoding support on Windows

Simon Peyton-Jones simonpj at microsoft.com
Wed Apr 24 09:12:47 CEST 2013


Great stuff.

One thing: have you left enough documentation in the code that, when someone comes along in three years' time, they can understand the problem and how you have dealt with it?  Lots of "Note [Blah]" stuff?  Or something.

Thanks

Simon

From: ghc-devs-bounces at haskell.org [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Max Bolingbroke
Sent: 23 April 2013 21:29
To: ghc-devs at haskell.org
Subject: DBCS encoding support on Windows

Hi GHCers,

I've implemented support in GHC for extra Windows code pages on the branch "dbcs" of the base library.

The problem this solves is that currently users of Haskell on a Windows machine running in a locale which uses a double-byte code page such as CP936 (GBK) or CP950 (Big5) cannot properly interact with the Windows console in their native language. Unfortunately, code page support is a prerequisite for getting this to work correctly because, for all Microsoft's fine talk about Unicode being the future, the Windows console does not seem to support it properly: code pages are the only way to go for console input and output.

Because they are the standard Windows locale encodings in many regions, these code pages are also the predominant method of encoding text files there, so they are useful outside the console too.

The solution is along the lines suggested in http://hackage.haskell.org/trac/ghc/ticket/3977, i.e. we create an iconv-like interface to Windows' MultiByteToWideChar and WideCharToMultiByte APIs by the judicious use of binary search. In my branch, these APIs will be used whenever we don't have a built-in native Haskell TextEncoding for the code page (we used to fall back on using latin1 for such code pages).
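
To give a rough idea of where binary search comes in, here is a small pure sketch. It is not the code on the branch. An iconv-like interface has to consume as much input as fits in a fixed-size output buffer, whereas a whole-buffer conversion call only reports how much output a given input needs, so one can search for the longest input prefix that fits. Everything here is hypothetical: decodePrefix is a toy double-byte decoder standing in for MultiByteToWideChar, and largestFitting and bytesToConsume are names made up for illustration.

```haskell
import Data.Word (Word8)

-- Toy stand-in for a whole-buffer conversion call (an assumption, not a
-- real code page). Bytes below 0x80 are single-byte characters; any other
-- byte is a lead byte that also consumes the byte after it.
-- Returns (characters produced, input ended on a dangling lead byte).
decodePrefix :: [Word8] -> (Int, Bool)
decodePrefix = go 0
  where
    go n [] = (n, False)
    go n (b : bs)
      | b < 0x80  = go (n + 1) bs
      | otherwise = case bs of
          []         -> (n, True)
          (_ : rest) -> go (n + 1) rest

-- Largest n in [0, hi] with cost n <= capacity, assuming cost is
-- non-decreasing and cost 0 <= capacity.
largestFitting :: (Int -> Int) -> Int -> Int -> Int
largestFitting cost capacity = go 0
  where
    go lo up
      | lo >= up             = lo
      | cost mid <= capacity = go mid up
      | otherwise            = go lo (mid - 1)
      where
        mid = (lo + up + 1) `div` 2

-- Number of input bytes to consume so that the decoded text fits in an
-- output buffer holding `capacity` characters. If the chosen prefix ends
-- in the middle of a double-byte character, the lead byte is left for
-- the next call.
bytesToConsume :: Int -> [Word8] -> Int
bytesToConsume capacity input
  | dangling  = n - 1
  | otherwise = n
  where
    outSize k     = fst (decodePrefix (take k input))
    n             = largestFitting outSize capacity (length input)
    (_, dangling) = decodePrefix (take n input)
```

For the six bytes [0x41, 0x81, 0x40, 0x42, 0x82, 0x50] (four characters in the toy encoding), bytesToConsume 2 picks the first three bytes, and bytesToConsume 1 picks only the first byte rather than splitting the double-byte character that follows. The search needs O(log n) trial conversions per call instead of one per byte.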

Unless there are any objections I'll merge this into the base library main branch next week.

Cheers,
Max