Data.ByteString candidate 3

John Meacham john at repetae.net
Tue Apr 25 16:46:40 EDT 2006


On Tue, Apr 25, 2006 at 02:34:20PM +0100, Simon Marlow wrote:
> Duncan Coutts wrote:
> 
> >How would we distinguish a full fixed-width 4-byte Unicode version?
> 
> Good point, and that's why using the Data.PackedString hierarchy was 
> nice, because it accommodated various different character widths.  I 
> quite like
> 
>   Data.ByteString
>   Data.PackedString.Latin1
>   Data.PackedString.UTF8
>   Data.PackedString.UCS4
>   etc.

Do we really need all of these? UCS4BE? UTF16? If you care intimately
about the underlying binary representation, then you should be using
ByteString directly, since you are working with binary data. If you just
want a fast String replacement, then you don't care about the internal
representation; you just want it to be fast.

We don't want a situation where someone's library takes UTF8 strings but
someone else's takes UCS4 strings, and you need them to play nicely
together.

I think all we really need are

Data.ByteString
Data.PackedString

(Though, I suppose Latin1 could be useful)

But note: do the people who want Latin1 just need ASCII? If we have a
UTF8 PackedString, then we can make ASCII-specific access routines that
are just as fast as the ones in the Latin1 variety, without giving up the
ability to store full Unicode values in the string.
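
To illustrate the ASCII point, here is a rough sketch, not a proposed API. It
assumes a hypothetical Utf8String newtype over ByteString whose contents are
valid UTF-8; the names Utf8String, elemIndexAscii and splitAscii are made up
for this example. It relies on the fact that every byte of a multi-byte UTF-8
sequence has its high bit set, so a byte below 0x80 is always a complete ASCII
character and can be searched for without decoding.

```haskell
import qualified Data.ByteString as B
import Data.Char (ord)

-- Hypothetical wrapper (an assumption for this sketch): the bytes are
-- taken to be valid UTF-8.
newtype Utf8String = Utf8String B.ByteString
  deriving (Eq, Show)

-- Byte offset of the first occurrence of an ASCII character.
-- No decoding is needed: a byte below 0x80 never occurs inside a
-- multi-byte UTF-8 sequence.
elemIndexAscii :: Char -> Utf8String -> Maybe Int
elemIndexAscii c (Utf8String bs)
  | ord c < 0x80 = B.elemIndex (fromIntegral (ord c)) bs
  | otherwise    = Nothing  -- not ASCII; a decoding search would be needed

-- Split on an ASCII delimiter, again working on raw bytes.
splitAscii :: Char -> Utf8String -> Maybe [Utf8String]
splitAscii c (Utf8String bs)
  | ord c < 0x80 = Just (map Utf8String (B.split (fromIntegral (ord c)) bs))
  | otherwise    = Nothing

-- "naïve,café" encoded as UTF-8 (ï = C3 AF, é = C3 A9).
sample :: Utf8String
sample = Utf8String (B.pack
  [0x6E,0x61,0xC3,0xAF,0x76,0x65,0x2C,0x63,0x61,0x66,0xC3,0xA9])
```

With this, elemIndexAscii ',' sample finds the comma at byte offset 6 (a byte
offset, not a character index), and the string still holds non-ASCII
characters.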

        John



-- 
John Meacham - ⑆repetae.net⑆john⑈

