[Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

John Meacham john at repetae.net
Thu Feb 8 20:11:45 EST 2007


On Mon, Feb 05, 2007 at 01:14:26PM +0100, Twan van Laarhoven wrote:
> The reason for inventing my own encoding is that it is easier to use and 
> takes less space than UTF-8. The only advantage UTF-8 has is that it can 
> be read and written directly. I guess this is a trade off, faster 
> manipulation and smaller storage compared to simpler and faster io. I 
> have not benchmarked it either way, so it is just guesswork for now.

I would very strongly recommend using UTF-8. Inventing new formats
without very clear and pervasive benefits is just not good practice, and
I wouldn't want to see it in standard libraries.

The value of conversion between UTF-8 and ASCII bytestrings and
CompactStrings being a no-op should not be underestimated.
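
To make the no-op point concrete, here is a minimal sketch. The
Utf8String wrapper and both function names are hypothetical and are not
part of Data.CompactString; the only fact relied on is that every ASCII
byte sequence is already valid UTF-8, so the "conversion" reduces to a
validity check and no bytes are rewritten.

```haskell
import qualified Data.ByteString as B

-- Hypothetical wrapper (an assumption for illustration): a ByteString
-- that is known to contain valid UTF-8.
newtype Utf8String = Utf8String B.ByteString
  deriving (Eq, Show)

-- ASCII is a subset of UTF-8, so converting an ASCII bytestring only
-- needs a check that every byte is below 0x80. The bytes are reused
-- unchanged.
fromAscii :: B.ByteString -> Maybe Utf8String
fromAscii bs
  | B.all (< 0x80) bs = Just (Utf8String bs)
  | otherwise         = Nothing

-- Going back to raw bytes is just unwrapping the newtype.
toByteString :: Utf8String -> B.ByteString
toByteString (Utf8String bs) = bs
```

With a custom encoding, both directions would instead have to walk the
string and re-encode it.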

Not to mention that UTF-8 was designed so that sorting raw bytestrings
containing UTF-8 produces exactly the same result as decoding them and
then sorting by code point. That is a _very_ large win for the 'Ord'
instance for CompactString.
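
The ordering property can be checked with a small sketch. The encoder
below is hand-written for illustration (it is not taken from any
library, and all names are assumptions); it compares strings by their
UTF-8 bytes and by their code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.List (sort, sortBy)
import Data.Ord (comparing)
import Data.Word (Word8)

-- Encode one code point as UTF-8 (1 to 4 bytes).
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. top 6, cont 0]
  | n < 0x10000 = [0xE0 .|. top 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. top 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    top, cont :: Int -> Word8
    -- Leading bits of the code point, placed in the lead byte.
    top s  = fromIntegral (n `shiftR` s)
    -- A continuation byte: 10xxxxxx carrying six payload bits.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

encodeString :: String -> [Word8]
encodeString = concatMap encodeChar

-- Sort by raw UTF-8 bytes, without looking at code points.
sortRaw :: [String] -> [String]
sortRaw = sortBy (comparing encodeString)

-- Sort by code point (the ordinary Ord instance for String).
sortDecoded :: [String] -> [String]
sortDecoded = sort

-- Sample strings covering 1-, 2-, 3- and 4-byte encodings.
samples :: [String]
samples = ["z", "\xE9", "a", "\x20AC", "\x1F600", "ab", ""]
```

Here sortRaw samples and sortDecoded samples agree, so an 'Ord' instance
over UTF-8 storage can compare the underlying bytes directly.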

And it is not just files: foreign functions in UTF-8 locales often take
strings as arguments or return them, and being able to just call those
directly with the bytestring contents is also a big win.
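
As a rough sketch of the foreign-function case, the C function strlen
stands in for any routine that expects a NUL-terminated string in the
locale encoding. The name byteLength is made up for this example.
Data.ByteString.useAsCString copies the bytes and appends the
terminator, so "directly" here means without transcoding, not without
copying.

```haskell
import qualified Data.ByteString as B
import Foreign.C.String (CString)
import Foreign.C.Types (CSize(..))

-- Plain C strlen; the FFI is part of Haskell 2010, so no extension
-- pragma should be needed with a current GHC.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Hand UTF-8 bytes to C unchanged. An embedded NUL byte would cut the
-- string short on the C side.
byteLength :: B.ByteString -> IO Int
byteLength bs = B.useAsCString bs (fmap fromIntegral . c_strlen)
```

For the three UTF-8 bytes 0x68 0xC3 0xA9 ("h" followed by e-acute),
byteLength reports 3, since strlen counts bytes rather than characters.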


        John

-- 
John Meacham - ⑆repetae.net⑆john⑈

