[Haskell-cafe] Re: PROPOSAL: New efficient Unicode string library.

Wed Oct 3 02:30:13 EDT 2007

On Tue, 2007-10-02 at 21:45 -0400, Brandon S. Allbery KF8NH wrote:

> > Due to the additional complexity of handling UTF-8 -- EVEN IF the  
> > actual text processed happens all to be US-ASCII -- will UTF-8  
> > perhaps be less efficient than UTF-16, or only as fast?

> UTF8 will be very slightly faster in the all-ASCII case, but quickly  
> blows chunks if you have *any* characters that require multibyte.   

What benchmarks are you basing this on?  Doubling your data size, as
UTF-16 does for ASCII text, is going to cost you if you are doing simple
operations (searching, say), but I don't see UTF-8 being particularly
expensive.  Somebody (I forget who) implemented UTF-8 on top of
ByteString, and IIRC the benchmark numbers didn't change all that much
from the regular Char8.
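
To put rough numbers on the size argument, here is a minimal sketch
that only counts encoded bytes per code point.  It is an illustration,
not the ByteString-based implementation mentioned above, and the names
(utf8Bytes, utf16Bytes, utf8Size, utf16Size) are made up for this
example.  It says nothing about decoding cost, only about how much
data a simple scan would have to touch.

```haskell
import Data.Char (ord)

-- Bytes needed to encode a single code point in UTF-8.
utf8Bytes :: Char -> Int
utf8Bytes c
  | n < 0x80    = 1   -- US-ASCII
  | n < 0x800   = 2
  | n < 0x10000 = 3   -- rest of the Basic Multilingual Plane
  | otherwise   = 4
  where n = ord c

-- Bytes needed to encode a single code point in UTF-16.
utf16Bytes :: Char -> Int
utf16Bytes c
  | ord c < 0x10000 = 2   -- one 16-bit code unit
  | otherwise       = 4   -- surrogate pair

-- Total encoded size of a string, without a BOM.
utf8Size, utf16Size :: String -> Int
utf8Size  = sum . map utf8Bytes
utf16Size = sum . map utf16Bytes
```

In GHCi, utf8Size "hello, world" gives 12 and utf16Size "hello, world"
gives 24, while for "na\239ve" (one non-ASCII character) the sizes are
6 and 10.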

-k