>>> All software I write professionally has to support 40 languages
>>> (including CJK ones), so I would prefer UTF-16 in case I can use
>>> Haskell at work some day in the future. I don't think that who uses
>>> which encoding the most is good grounds for picking an encoding,
>>> though. Ease of implementation and speed on some representative
>>> sample set of text may be.
>> UTF-8 supports CJK languages too.  The only question is efficiency
> Due to the additional complexity of handling UTF-8 -- EVEN IF the  
> actual text processed happens all to be US-ASCII -- will UTF-8  
> perhaps be less efficient than UTF-16, or only as fast?

UTF-8 will be very slightly faster in the all-ASCII case, but it quickly
blows chunks if you have *any* characters that require a multibyte
sequence. Given the way UTF-8 encoding works, this includes even the
non-ASCII part of Latin-1 (two bytes each), never mind CJK (typically
three bytes each, versus two in UTF-16). I think people have been
missing that point: UTF-8 is only cheap for 00-7F, *nothing else*.
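
To make the size side of that concrete, here is a rough sketch in Python (standard library only). The helper name encoded_sizes and the three sample characters are mine, picked purely for illustration. It only counts encoded bytes per character; it does not measure decoding speed, so take it as a hint about where the multibyte paths start, not as a benchmark.

```python
# Sketch: encoded size per character in UTF-8 vs UTF-16 (no byte-order mark).
# The sample characters are illustrative assumptions, not a representative corpus.
samples = {
    "ASCII 'a' (U+0061)": "a",
    "Latin-1 e-acute (U+00E9)": "\u00e9",
    "CJK ideograph (U+4E2D)": "\u4e2d",
}


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, without a byte-order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for label, ch in samples.items():
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"{label}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

The ASCII sample comes out at 1 byte vs 2, the Latin-1 sample at 2 vs 2, and the CJK sample at 3 vs 2, which is the pattern described above: UTF-8 only wins on size below 80.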

