Unicode Windows console output.
David Sankel
camior at gmail.com
Thu Nov 4 14:47:53 EDT 2010
On Thu, Nov 4, 2010 at 6:09 AM, Simon Marlow <marlowsd at gmail.com> wrote:
> On 04/11/2010 02:35, David Sankel wrote:
>
>> On Wed, Nov 3, 2010 at 9:00 AM, Simon Marlow <marlowsd at gmail.com> wrote:
>>
>> On 03/11/2010 10:36, Bulat Ziganshin wrote:
>>
>> Hello Max,
>>
>> Wednesday, November 3, 2010, 1:26:50 PM, you wrote:
>>
>> 1. You need to use "chcp 65001" to set the console code page to UTF-8.
>> 2. It is very likely that your Windows console won't have the fonts
>> required to actually make sense of the output. Pipe the output to
>> foo.txt. If you open this file in Notepad you will see the correct
>> characters show up.
>>
>>
>> it will work even without chcp. afaik neither GHC nor Windows adjusts
>> text being output to the current console code page
>>
>>
>> GHC certainly does. We use GetConsoleCP() when deciding what code
>> page to use by default - see
>> libraries/base/GHC/IO/Encoding/CodePage.hs.
>>
>>
>>
>> This can actually be quite helpful. I've discovered that if you have a
>> console set to code page 65001 (UTF-8) and use WriteConsoleA (the
>> non-wide version) with UTF-8 encoded strings, the console displays the
>> text properly!
>>
>> So the solution seems to be, when outputting to a utf8 console use
>> WriteConsoleA.
>>
>
> We need someone to rewrite the IO library backend for Win32. Currently it
> is going via the msvcrt POSIX emulation layer, i.e. using write() and
> pseudo-file-descriptors. More than a few problems have been caused by this,
> and it's totally unnecessary except that we get to share some code between
> the POSIX and Windows backends. We ought to be using the native Win32 APIs
> and HANDLE directly, then we could use WriteConsoleA.
>
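For reference, the WriteConsoleA approach discussed above can be sketched in C roughly as follows. This is only an illustration, not GHC code: the function name write_utf8 is made up, and switching the output code page with SetConsoleOutputCP(CP_UTF8) is assumed to be equivalent to running "chcp 65001" beforehand. On non-Windows builds, or when stdout is not a console, the sketch simply writes the bytes unchanged.

```c
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: write a UTF-8 encoded string to standard output.
 * Returns 0 on success and -1 on failure. */
int write_utf8(const char *s)
{
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode;
    DWORD written = 0;

    /* GetConsoleMode succeeds only for a real console, not a pipe or file. */
    if (out != INVALID_HANDLE_VALUE && GetConsoleMode(out, &mode)) {
        UINT old_cp = GetConsoleOutputCP();
        BOOL ok;

        /* Assumption: same effect on output as "chcp 65001". */
        SetConsoleOutputCP(CP_UTF8);
        /* The non-wide call, handed the UTF-8 bytes as they are. */
        ok = WriteConsoleA(out, s, (DWORD)strlen(s), &written, NULL);
        SetConsoleOutputCP(old_cp);   /* restore the previous code page */
        return ok ? 0 : -1;
    }
#endif
    /* Not a Windows console: pass the bytes through unchanged. */
    return fputs(s, stdout) == EOF ? -1 : 0;
}
```

A call such as write_utf8("caf\xc3\xa9\n") would then hand the UTF-8 bytes straight to the console. Whether the glyphs actually render still depends on the console font, as noted earlier in the thread.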
It looks like replacing the POSIX layer isn't necessary to fix the Unicode
console output bug. I've made a ticket and in a comment I illustrate the
_setmode call that magically makes everything work:
http://hackage.haskell.org/trac/ghc/ticket/4471
I could attempt a GHC patch for this, but I don't have any experience with
the GHC code base. Perhaps someone could add this _setmode call with relative
ease?
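A minimal sketch of what such a _setmode call could look like in C is below. It does not reproduce the code in the ticket: the function name set_unicode_stdout is hypothetical, and the choice of _O_U16TEXT as the translation mode is an assumption (the ticket comment may use a different mode). On non-Windows builds the function is a no-op so that the sketch still compiles.

```c
#include <stdio.h>

#ifdef _WIN32
#include <io.h>      /* _setmode, _fileno */
#include <fcntl.h>   /* _O_U16TEXT */
#endif

/* Hypothetical helper: switch stdout into a Unicode translation mode.
 * Returns 0 on success and -1 on failure. */
int set_unicode_stdout(void)
{
#ifdef _WIN32
    /* Assumption: _O_U16TEXT is the mode wanted here.
     * _setmode returns the previous mode, or -1 on error. */
    return _setmode(_fileno(stdout), _O_U16TEXT) == -1 ? -1 : 0;
#else
    /* No equivalent call is needed on other platforms. */
    return 0;
#endif
}
```

One caveat with this mode: once stdout is in _O_U16TEXT, the Microsoft C runtime expects output through the wide functions (wprintf, fputws and so on), so narrow printf calls on the same stream should be avoided after the switch.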
David
--
David Sankel
Sankel Software
www.sankelsoftware.com
585 617 4748 (Office)