Haskell Platform Proposal: add the 'text' library

Duncan Coutts duncan.coutts at googlemail.com
Wed Sep 8 06:05:38 EDT 2010


On 8 September 2010 10:56, Johan Tibell <johan.tibell at gmail.com> wrote:

> The lazy version of Text uses one more word per value than the strict
> version. This can be significant for small strings (e.g. ~8 characters)
> where the overhead per character already is quite high. If I counted the
> size of the BA# constructor correctly, a strict Text has a fixed overhead of
> 7 words and a lazy Text has an overhead of 8 words. This matters when you
> e.g. want to use Texts as keys in a Map.

Ah, well if we're playing that game then I have a representation where
lazy uses the same storage as strict. :-)

The trick is to save a word by using smaller length and offset fields
(e.g. 16-bit). That works for lazy but not for strict: with lazy you
can always break a long string into multiple 2^16-sized chunks, whereas
a strict Text is a single chunk, so it's essential to be able to use
32/64-bit lengths and offsets.
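
To make the chunking idea concrete, here is a minimal sketch using only
base. It is not the text library's representation: Chunk, toChunks and
fromChunks are hypothetical names, and a String stands in for the
underlying array. Note that a plain Word16 length tops out at 2^16 - 1,
so the sketch caps chunks there; storing length - 1 would be one way to
reach a full 2^16.

```haskell
import Data.Word (Word16)

-- Hypothetical chunk: a slice of a buffer whose offset and length
-- fields are only 16 bits wide. The String is a stand-in for the
-- real backing array (an assumption made for illustration).
data Chunk = Chunk
  { chunkBuf :: String
  , chunkOff :: !Word16
  , chunkLen :: !Word16
  }

-- Largest length a Word16 field can hold (65535).
maxChunkLen :: Int
maxChunkLen = fromIntegral (maxBound :: Word16)

-- Split an arbitrarily long string into chunks that each fit the
-- 16-bit length field. Each chunk gets its own buffer, so its
-- offset is always 0 in this simplified model.
toChunks :: String -> [Chunk]
toChunks [] = []
toChunks s =
  let (h, t) = splitAt maxChunkLen s
  in Chunk h 0 (fromIntegral (length h)) : toChunks t

-- Reassemble the original string from its chunks.
fromChunks :: [Chunk] -> String
fromChunks = concatMap slice
  where
    slice (Chunk b o l) = take (fromIntegral l) (drop (fromIntegral o) b)
```

A strict value has no list of chunks to fall back on, which is why the
same narrowing would cap its total length instead of just the size of
each piece.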

> Btw, I see that the BA# constructor is not manually unpacked into the Array
> data type. Is that done automatically since ByteArray# is unlifted or is
> there some room for improvement here?

I'm not sure what you're referring to here. The definition is:

data UArray i e = UArray !i !i !Int ByteArray#

The ByteArray# is an unlifted type (but its representation is a
pointer to a heap object).
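
As a rough illustration of how an unlifted field and unpacking
interact, here is a small sketch. Arr, Str, newArr and sizeOfArr are
hypothetical names chosen for this example, not the definitions used by
array or text. It assumes GHC, and the UNPACK pragmas typically only
take effect when optimisation is enabled.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts (ByteArray#, Int (I#), newByteArray#, sizeofByteArray#,
                 unsafeFreezeByteArray#)
import GHC.ST (ST (ST), runST)

-- Hypothetical wrapper. The field is unlifted, so it is held as a
-- direct pointer to the byte array heap object and is never a thunk.
data Arr = Arr ByteArray#

-- Hypothetical string header. UNPACK asks GHC to store the Arr
-- field's ByteArray# pointer and the two Ints inline in the Str
-- constructor instead of pointing at separate boxes.
data Str = Str {-# UNPACK #-} !Arr {-# UNPACK #-} !Int {-# UNPACK #-} !Int

-- Allocate a byte array of n bytes (contents unspecified) and freeze it.
newArr :: Int -> Arr
newArr (I# n) = runST (ST (\s0 ->
  case newByteArray# n s0 of
    (# s1, mba #) -> case unsafeFreezeByteArray# mba s1 of
      (# s2, ba #) -> (# s2, Arr ba #)))

-- Size in bytes of the wrapped array.
sizeOfArr :: Arr -> Int
sizeOfArr (Arr ba) = I# (sizeofByteArray# ba)
```

In this sketch the wrapper constructor is the only thing a pragma could
remove; the ByteArray# itself stays a pointer to a heap object either
way.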

Duncan

