[Haskell-cafe] Re: PROPOSAL: New efficient Unicode string library.

Deborah Goldsmith dgoldsmith at mac.com
Tue Oct 2 18:20:00 EDT 2007


On Oct 2, 2007, at 3:01 PM, Twan van Laarhoven wrote:
> Lots of people wrote:
> > I want a UTF-8 bikeshed!
> > No, I want a UTF-16 bikeshed!
>
> What the heck does it matter what encoding the library uses  
> internally? I expect the interface to be something like (from my own  
> CompactString library):
> > fromByteString :: Encoding -> ByteString -> UnicodeString
> > toByteString   :: Encoding -> UnicodeString -> ByteString

I agree, from an API perspective the internal encoding doesn't matter.

> The only matter is efficiency for a particular encoding.

This matters a lot.

> I would suggest that we get a working library first. Either UTF-8 or
> UTF-16 will do, as long as it works.
>
> Even better would be to implement both (and perhaps more encodings),  
> and then benchmark them to get a sensible default. Then the choice  
> can be made available to the user as well, in case someone has  
> specific needs. But again: get it working first!

The problem is that the internal encoding can have a big effect on the  
implementation of the library. It's better not to have to do it over  
again if the first choice is not optimal.
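
As a rough illustration of how the storage format leaks into even the
simplest operation, here is a minimal sketch of counting code points
over UTF-8 bytes versus UTF-16 code units. The names lengthUtf8 and
lengthUtf16 are hypothetical and belong to no proposed library, plain
lists stand in for a packed representation, and the input is assumed to
be well-formed:

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8, Word16)

-- Hypothetical helpers for illustration only; input is assumed well-formed.

-- UTF-8: every byte that is not a continuation byte (10xxxxxx)
-- starts a new code point.
lengthUtf8 :: [Word8] -> Int
lengthUtf8 = length . filter (\b -> b .&. 0xC0 /= 0x80)

-- UTF-16: every code unit that is not a low (trailing) surrogate
-- (0xDC00-0xDFFF) starts a new code point.
lengthUtf16 :: [Word16] -> Int
lengthUtf16 = length . filter (\u -> u .&. 0xFC00 /= 0xDC00)

main :: IO ()
main = do
  -- The same three code points (U+0061, U+00E9, U+1D11E) in each form.
  let utf8  = [0x61, 0xC3, 0xA9, 0xF0, 0x9D, 0x84, 0x9E]
      utf16 = [0x0061, 0x00E9, 0xD834, 0xDD1E]
  print (lengthUtf8 utf8)    -- 3
  print (lengthUtf16 utf16)  -- 3
```

Both functions give the same answer, but the bit tests, the element
width, and the number of units scanned all differ, and the same split
shows up again in indexing, slicing, and case mapping.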

I'm just trying to share the experience of the Unicode Consortium, the  
ICU library contributors, and Apple, with the Haskell community. They,  
and I personally, have many years of experience implementing support  
for Unicode.

Anyway, I think we're starting to repeat ourselves...

Deborah


