[Haskell-cafe] Ready for testing: Unicode support for Handle I/O
John Goerzen
jgoerzen at complete.org
Tue Feb 3 22:49:10 EST 2009
Duncan Coutts wrote:
> Sorry, I think we've been talking at cross purposes.
I think so.
>> There always has to be *some* conversion from a 32-bit Char to the
>> system's selection, right?
>
> Yes. In text mode there is always some conversion going on. Internally
> there is a byte buffer and a char buffer (i.e. UTF-32).
>
>> What exactly do we have to do to avoid the penalty?
>
> The penalty we're talking about here is not the cost of converting bytes
> to characters, it's in switching which encoding the Handle is using. For
> example you might read some HTTP headers in ASCII and then switch the
> Handle encoding to UTF-8 to read some XML.
Simon referenced a 30% penalty. Are you saying that if we read from a
Handle using the same encoding we used when we first opened it, we
won't see any slowdown compared to the I/O system in GHC 6.10?
> Switching the Handle encoding has a penalty. We have to discard the
> characters that we pre-decoded and re-decode the byte buffer in the new
> encoding. It's actually slightly more complicated because we do not
Got it. That makes sense, as does the decision to optimize for the more
common (not switching the encoding) case.
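
For concreteness, here is a minimal sketch of the switching case Duncan
describes: read a header line in one encoding, then change the encoding
on the same Handle before reading the body. It assumes the hSetEncoding,
latin1 and utf8 names as exported by System.IO in released GHC (6.12 and
later); the patch under test in this thread may differ. The helper name
readHeaderThenBody is hypothetical, and latin1 stands in for ASCII
because System.IO does not export a dedicated ASCII encoding.

```haskell
import System.IO

-- Hypothetical helper: read the first line as Latin-1 (standing in for
-- ASCII headers), then switch the same Handle to UTF-8 for the rest.
readHeaderThenBody :: FilePath -> IO (String, String)
readHeaderThenBody path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h latin1
    header <- hGetLine h
    -- This is the switch that carries the penalty: characters already
    -- decoded ahead of the read position have to be thrown away and
    -- the underlying bytes decoded again in the new encoding.
    hSetEncoding h utf8
    body <- hGetContents h
    -- hGetContents is lazy, so force the body before withFile closes h.
    length body `seq` return (header, body)
```

A program that sets the encoding once, right after opening the Handle,
and never changes it again stays on the common path that the design is
optimized for.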
-- John