Haskell Platform Proposal: add the 'text' library

Wed Sep 8 05:56:20 EDT 2010

On Wed, Sep 8, 2010 at 12:21 AM, Duncan Coutts <duncan.coutts at googlemail.com> wrote:

> On 7 September 2010 22:50, Ian Lynagh <igloo at earth.li> wrote:
> > Are there cases when Data.Text is significantly faster than
> > Data.Text.Lazy? Do we need both? (Presumably .Lazy is built on top of
> > Data.Text, but do we need the user to have a complete interface for
> > both?)
>
> Mm, this is a fair question. In the case of bytestring we need both
> because sometimes for dealing with foreign code or IO you need the
> representation to be a contiguous block of memory. For text the
> representation is more abstract so that need does not arise. One
> might argue that if it is simply to control strictness then one could
> use the lazy version and provide a deepseq instance.
>
> Here's an alternative argument: suppose we change the representation
> of strict text to be a tree of chunks (e.g. finger tree). We could
> achieve efficient concatenation. This representation would be
> impossible while preserving semantics of a lazy tail. A tree impl that
> has any kind of balance needs to know the overall length so cannot
> have a lazy tail.
>
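
The tree-of-chunks argument quoted above can be illustrated with a minimal
sketch. This is not how the text library is implemented: Rope, fromChunk,
append and toString are hypothetical names, String stands in for a packed
chunk, and the tree is neither balanced nor a finger tree. It only shows why
caching the overall length rules out a lazy tail.

```haskell
-- Hypothetical rope: a binary tree of chunks with cached lengths.
-- String stands in for a packed strict chunk (an assumption made so the
-- sketch needs only base).
data Rope
  = Leaf !Int String    -- cached chunk length and the chunk itself
  | Node !Int Rope Rope -- cached total length and the two subtrees

-- Length is read from the cache in O(1).
len :: Rope -> Int
len (Leaf n _)   = n
len (Node n _ _) = n

fromChunk :: String -> Rope
fromChunk s = Leaf (length s) s

-- O(1) concatenation. The strict length field is computed from the
-- lengths of both arguments, so evaluating the result forces both sides
-- to weak head normal form; the right side cannot stay an unevaluated
-- tail of unknown size.
append :: Rope -> Rope -> Rope
append l r = Node (len l + len r) l r

-- Flatten back to a list of characters, left to right.
toString :: Rope -> String
toString (Leaf _ s)   = s
toString (Node _ l r) = toString l ++ toString r
```

Loaded in GHCi, evaluating `append (fromChunk "ab") undefined` to weak head
normal form fails, whereas consing onto a lazy chunk list would leave the
tail untouched.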

The lazy version of Text uses one more word per value than the strict
version. This can be significant for small strings (e.g. ~8 characters),
where the per-character overhead is already quite high. If I counted the
size of the BA# constructor correctly, a strict Text has a fixed overhead of
7 words and a lazy Text has a fixed overhead of 8 words. This matters when
you want to use Text values as keys in a Map, for example.
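
To put rough numbers on this, here is a back-of-the-envelope sketch. The
fixed overheads are the tentative counts given above; the 8-byte word size,
the 2 bytes per character (UTF-16 code units, no surrogate pairs) and the
function names are assumptions of the sketch, and Map node overhead is not
included.

```haskell
-- Rough per-value size estimate, in machine words.

-- Assumed word size in bytes (64-bit machine).
wordBytes :: Int
wordBytes = 8

-- Words needed for the character payload, rounded up to whole words.
-- Assumes 2 bytes per character.
payloadWords :: Int -> Int
payloadWords nChars = (2 * nChars + wordBytes - 1) `div` wordBytes

-- Fixed overheads as counted in this message (tentative).
strictOverhead, lazyOverhead :: Int
strictOverhead = 7
lazyOverhead   = 8

-- Estimated total words for a strict or lazy Text of n characters.
strictWords, lazyWords :: Int -> Int
strictWords n = strictOverhead + payloadWords n
lazyWords   n = lazyOverhead + payloadWords n
```

Under these assumptions an 8-character key has 2 words of payload, so it
comes to about 9 words as a strict Text and 10 as a lazy one.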

Btw, I see that the BA# constructor is not manually unpacked into the Array
data type. Is that done automatically since ByteArray# is unlifted or is
there some room for improvement here?

Cheers,
Johan