[Haskell-cafe] Re: String vs ByteString
nfjinjing at gmail.com
Wed Aug 18 01:42:27 EDT 2010
> John Millikin wrote:
>> The reason many Japanese and Chinese users reject UTF-8 isn't due to
>> space constraints (UTF-8 and UTF-16 are roughly equal), it's because
>> they reject Unicode itself.
> This is the thing Unicode advocates don't want to admit. Until Unicode has
> code points for _all_ Chinese and Japanese characters, there will be active
> resistance to adoption.
> Live well,
For mainland Chinese websites:
Most that became popular during web 1.0 (5-10 years ago) use a
utf-8 incompatible encoding, e.g. gb2312.
They probably didn't switch to utf-8 simply because they never had to.
However, many of the popular websites started during web 2.0 are adopting utf-8:
* renren.com (China's largest Facebook clone)
* www.kaixin001.com (China's second largest Facebook clone)
* t.sina.com.cn (an example of a Twitter clone)
These websites adopted utf-8 because (I think) most web development
tools have already standardized on utf-8, and there's little reason
I'm not aware of any (at least common) Chinese characters that can be
represented in gb2312 but not in Unicode: the range of gb2312 is
a subset of the range of gbk, which is a subset of the range of
gb18030, and gb18030 is just another encoding of Unicode.
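
A rough way to spot-check the subset chain above, sketched in Python only
because its standard library ships codecs named gb2312, gbk and gb18030.
The helper names and the sample characters are my own picks for
illustration, and a few samples are evidence, not a proof:

```python
def encodable(ch: str, codec: str) -> bool:
    """Return True if `ch` can be encoded with the named codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def same_bytes_up_chain(ch: str) -> bool:
    """Encode `ch` as gb2312, then check that the very same bytes
    decode back to `ch` under gbk and under gb18030."""
    raw = ch.encode("gb2312")
    return raw.decode("gbk") == ch and raw.decode("gb18030") == ch


def roundtrips_gb18030(s: str) -> bool:
    """Check that `s` survives a trip through gb18030 unchanged."""
    return s.encode("gb18030").decode("gb18030") == s


if __name__ == "__main__":
    # A common simplified character: present at every level of the chain.
    print(same_bytes_up_chain("中"))
    # A traditional character: assumed absent from gb2312, present in gbk.
    print(encodable("龍", "gb2312"), encodable("龍", "gbk"))
    # gb18030 should also carry code points outside the BMP.
    print(roundtrips_gb18030("中龍😀"))
```

If the gb2312 bytes decode to the same character under gbk and gb18030, and
arbitrary Unicode text round-trips through gb18030, that is at least
consistent with reading gb18030 as one more encoding of Unicode.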