[Haskell-cafe] PROPOSAL: New efficient Unicode string library.
Twan van Laarhoven
twanvl at gmail.com
Mon Sep 24 19:08:38 EDT 2007
Johan Tibell wrote:
> Dear haskell-cafe,
>
> I would like to propose a new, ByteString-like Unicode string library
> which can be used where both efficiency (currently offered by
> ByteString) and i18n support (currently offered by vanilla Strings)
> are needed. I wrote a skeleton draft today but I'm a bit tired so I
> didn't get all the details. Nevertheless I think it is fleshed out enough
> for some initial feedback. If I can get the important parts nailed
> down before Hackathon I could hack on it there.
>
> Apologies for not getting everything we discussed on #haskell down in
> the first draft. It'll get in there eventually.
>
> Bring out your Unicode kung-fu!
>
> http://haskell.org/haskellwiki/UnicodeByteString

Have you looked at my CompactString library[1]? It does essentially
this, with one extension: the type is parameterized over the encoding.
From the discussion on #haskell it would seem that some people consider
this unforgivable, while others consider it essential.

In my opinion flexibility is more important; you can always restrict
things later. For the common case where the encoding doesn't matter,
there is Data.CompactString.UTF8, which provides an un-parameterized
type. I called this type 'CompactString' as well, which might be a bit
unfortunate. I don't like the name UnicodeString, since it suggests that
the normal String somehow doesn't support Unicode. The UTF8 module could
be made more prominent: maybe Data.CompactString could be the
specialized type, while Data.CompactString.Parameterized supports
different encodings.
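
To make "parameterized over the encoding" concrete, here is a minimal
sketch of the idea. It is not the actual CompactString API: the names
Compact, Encoding, encodeChar, pack and byteLength, and the tags UTF8 and
Latin1, are all made up for illustration. It only assumes the bytestring
package and a GHC with ScopedTypeVariables.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import qualified Data.ByteString as B
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Proxy (Proxy (..))
import Data.Word (Word8)

-- Hypothetical encoding tags; they have no values and exist only as types.
data UTF8
data Latin1

-- The encoding is a phantom type parameter: every string is stored as a
-- plain ByteString, and the tag records how those bytes are to be read.
newtype Compact enc = Compact B.ByteString
  deriving (Eq, Show)

-- Hypothetical class describing how one character is encoded.
class Encoding enc where
  encodeChar :: Proxy enc -> Char -> [Word8]

instance Encoding Latin1 where
  encodeChar _ c
    | ord c < 256 = [fromIntegral (ord c)]
    | otherwise   = [63] -- '?' for characters Latin-1 cannot represent

instance Encoding UTF8 where
  -- Standard UTF-8 bit layout; surrogates are not rejected in this sketch.
  encodeChar _ c
    | n < 0x80    = [fromIntegral n]
    | n < 0x800   = [0xC0 .|. hi 6, cont 0]
    | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
    | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
    where
      n      = ord c
      hi s   = fromIntegral (n `shiftR` s)
      cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- The result type selects the encoding.
pack :: forall enc. Encoding enc => String -> Compact enc
pack = Compact . B.pack . concatMap (encodeChar (Proxy :: Proxy enc))

-- Works for any encoding, because it never inspects the tag.
byteLength :: Compact enc -> Int
byteLength (Compact bs) = B.length bs

-- The "common case" type is then just a synonym fixing the parameter.
type CompactString = Compact UTF8
```

With these definitions, byteLength (pack "na\239ve" :: Compact UTF8) is 6
while the same string as Compact Latin1 is 5, and the type checker keeps
the two from being mixed up. Restricting things later amounts to
exporting only the synonym.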

A word of warning: the library is still in the alpha stage of
development. I don't fully trust it myself yet :)
[1] http://twan.home.fmf.nl/compact-string/
Twan