[Haskell-cafe] Re: PROPOSAL: New efficient Unicode string library.
Deborah Goldsmith
dgoldsmith at mac.com
Tue Oct 2 18:20:00 EDT 2007
On Oct 2, 2007, at 3:01 PM, Twan van Laarhoven wrote:
> Lots of people wrote:
> > I want a UTF-8 bikeshed!
> > No, I want a UTF-16 bikeshed!
>
> What the heck does it matter what encoding the library uses
> internally? I expect the interface to be something like (from my own
> CompactString library):
> > fromByteString :: Encoding -> ByteString -> UnicodeString
> > toByteString :: Encoding -> UnicodeString -> ByteString
I agree, from an API perspective the internal encoding doesn't matter.
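To make that concrete, here is a minimal sketch of such an interface. Everything in it is hypothetical: the `Encoding` constructors, the list-of-`Char` placeholder representation, and the helper names are assumptions for illustration, not CompactString's actual API or anyone's proposed design. The point is only that callers see `fromByteString` and `toByteString`, so the representation behind `UnicodeString` could be swapped without touching them.

```haskell
module Main where

import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import qualified Data.ByteString as B
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical set of external encodings (assumption, kept tiny).
data Encoding = Latin1 | UTF16BE
  deriving (Eq, Show)

-- Placeholder representation. A real module would not export the
-- constructor, so it could be replaced by packed UTF-8 or UTF-16.
newtype UnicodeString = UnicodeString [Char]
  deriving (Eq, Show)

fromByteString :: Encoding -> B.ByteString -> UnicodeString
fromByteString enc = UnicodeString . decode enc . B.unpack

toByteString :: Encoding -> UnicodeString -> B.ByteString
toByteString enc (UnicodeString cs) = B.pack (concatMap (encode enc) cs)

-- Combine two bytes into one big-endian 16-bit code unit.
unit :: Word8 -> Word8 -> Int
unit a b = fromIntegral a `shiftL` 8 .|. fromIntegral b

isHigh, isLow :: Int -> Bool
isHigh u = u >= 0xD800 && u <= 0xDBFF
isLow  u = u >= 0xDC00 && u <= 0xDFFF

-- Malformed input is replaced by U+FFFD rather than rejected.
decode :: Encoding -> [Word8] -> [Char]
decode Latin1  = map (chr . fromIntegral)
decode UTF16BE = go
  where
    go (a:b:rest)
      | isHigh u, (c:d:rest') <- rest, isLow (unit c d) =
          chr (0x10000 + ((u - 0xD800) `shiftL` 10) + (unit c d - 0xDC00))
            : go rest'
      | isHigh u || isLow u = '\xFFFD' : go rest
      | otherwise           = chr u : go rest
      where u = unit a b
    go [_] = "\xFFFD"
    go []  = []

-- Sketch only: lone surrogate Chars are not checked for on output.
encode :: Encoding -> Char -> [Word8]
encode Latin1 c
  | ord c < 0x100 = [fromIntegral (ord c)]
  | otherwise     = [0x3F]               -- '?' for unrepresentable characters
encode UTF16BE c
  | n < 0x10000 = bytes n
  | otherwise   = bytes hi ++ bytes lo
  where
    n  = ord c
    n' = n - 0x10000
    hi = 0xD800 + (n' `shiftR` 10)
    lo = 0xDC00 + (n' .&. 0x3FF)
    bytes u = [fromIntegral (u `shiftR` 8), fromIntegral (u .&. 0xFF)]

main :: IO ()
main = do
  let latin1 = B.pack [0x63, 0x61, 0x66, 0xE9]   -- "caf" followed by U+00E9
  -- Transcode Latin-1 to UTF-16BE through the abstract type.
  print (B.unpack (toByteString UTF16BE (fromByteString Latin1 latin1)))
  -- prints [0,99,0,97,0,102,0,233]
```

Nothing in `main` depends on how `UnicodeString` stores its characters, which is the sense in which the internal encoding is invisible at the API level.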
>
> The only matter is efficiency for a particular encoding.
This matters a lot.
>
>
> I would suggest that we get a working library first. Either UTF-8 or
> UTF-16 will do, as long as it works.
>
> Even better would be to implement both (and perhaps more encodings),
> and then benchmark them to get a sensible default. Then the choice
> can be made available to the user as well, in case someone has
> specific needs. But again: get it working first!
The problem is that the internal encoding can have a big effect on the
implementation of the library. It's better not to have to do it over
again if the first choice is not optimal.
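As a small illustration of how the code unit size leaks into the implementation, consider counting code points. The sketch below uses plain lists of code units and hypothetical function names (`lengthUtf8`, `lengthUtf16`), and assumes well-formed input; it is not taken from ICU or any existing library. Even for this one operation the test applied to each unit differs, and the same is true of indexing, slicing, case mapping, and everything else that walks the buffer.

```haskell
module Main where

import Data.Bits ((.&.))
import Data.Word (Word8, Word16)

-- UTF-8: every byte except a continuation byte (10xxxxxx)
-- starts a new code point. Assumes well-formed input.
lengthUtf8 :: [Word8] -> Int
lengthUtf8 = length . filter (\b -> b .&. 0xC0 /= 0x80)

-- UTF-16: every unit except a low (trailing) surrogate
-- (0xDC00..0xDFFF) starts a new code point. Assumes well-formed input.
lengthUtf16 :: [Word16] -> Int
lengthUtf16 = length . filter (\u -> u < 0xDC00 || u > 0xDFFF)

main :: IO ()
main = do
  -- The same three code points, U+0061, U+00E9, U+1D11E, in each encoding.
  print (lengthUtf8  [0x61, 0xC3, 0xA9, 0xF0, 0x9D, 0x84, 0x9E])  -- prints 3
  print (lengthUtf16 [0x0061, 0x00E9, 0xD834, 0xDD1E])            -- prints 3
```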
I'm just trying to share the experience of the Unicode Consortium, the
ICU library contributors, and Apple, with the Haskell community. They,
and I personally, have many years of experience implementing support
for Unicode.
Anyway, I think we're starting to repeat ourselves...
Deborah