Haskell Platform Proposal: add the 'text' library

wren ng thornton wren at community.haskell.org
Fri Sep 10 17:57:05 EDT 2010


On 9/10/10 5:18 PM, Bryan O'Sullivan wrote:
> On Fri, Sep 10, 2010 at 2:03 PM, wren ng thornton
> <wren at community.haskell.org> wrote:
>
>> Yes, that was my point. I can see uses for (Text->...),
>> ((Text->Bool)->...), and ((Char->Bool)->...) but the middle one ---which
>> seems to be the closest analogue to String and ByteString--- is missing. The
>> first one is posited as a replacement for the middle one, but it is
>> insufficient since it cannot perform disjunctive searches.
>>
>
> I don't think anyone posited it as a replacement for the middle one?

You did, kinda :) More specifically, you proposed (Text->...) as the
analogue for ((Char->Bool)->...) and ((Char8->Bool)->...).

> We
> could replace Char->Bool with Text->Bool, but it would be slower (and yes,
> that matters to me). I don't intend to add it myself, but you're welcome to
> put together a patch and a set of QuickCheck tests.
>
>> Why do we not just have the middle ((Text->Bool)->...) option?
>
> Because you can't do a Boyer-Moore search off it.

I'm fine with the performance argument; I'm just pointing out why I see
the API as inconsistent with the String/ByteString APIs. The break
function for String/ByteString is rather entrenched as a way of breaking
on a single character. A function that uses Boyer--Moore to break on an
arbitrary string (not just the multi-Char strings required by the
mismatch between a "character" and a Char) doesn't seem like the
analogous function; I think it's much closer to breakSubstring than it
is to break. Whether break/breakSubstring or breakBy/break is the better
set of names is a different bike shed.
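
To make the three shapes concrete, here is a rough sketch on plain
String using only base. It is not the text library's API: breakSuffixBy
and breakOnNeedle are hypothetical names, the scan is naive rather than
Boyer--Moore, and applying the predicate to the remaining suffix is only
one possible reading of ((Text->Bool)->...).

```haskell
import Data.List (isPrefixOf)

-- Hypothetical ((String -> Bool) -> ...) shape: split at the first
-- position whose remaining suffix satisfies the predicate.
breakSuffixBy :: (String -> Bool) -> String -> (String, String)
breakSuffixBy p = go []
  where
    go acc rest
      | p rest = (reverse acc, rest)
    go acc []         = (reverse acc, [])
    go acc (c : rest) = go (c : acc) rest

-- Needle shape, in the spirit of breakSubstring (naive scan here).
breakOnNeedle :: String -> String -> (String, String)
breakOnNeedle needle = breakSuffixBy (needle `isPrefixOf`)

-- A disjunctive search: stop at either kind of line ending.
lineEnd :: String -> Bool
lineEnd s = "\r\n" `isPrefixOf` s || "\n" `isPrefixOf` s

main :: IO ()
main = do
  -- Single-character predicate shape: Prelude's break.
  print (break (== ',') "key,value")       -- ("key",",value")
  print (breakOnNeedle "::" "name::type")  -- ("name","::type")
  print (breakSuffixBy lineEnd "one\r\ntwo")
```

The needle version falls out of the predicate version, but not the
other way around, which is the sense in which a single needle cannot
express a disjunctive search. The price is that an opaque predicate
gives a Boyer--Moore style search nothing to precompute.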

As for ((Text->Bool)->...) vs ((Char->Bool)->...), I pointed it out 
because you've mentioned the discrepancy between Char and "characters". 
In practice, I'd expect that the majority of characters that people wish 
to break on are indeed Chars, so performance wins out in API design. 
However, there's no mention of the discrepancy in the documentation, 
which I think is an oversight.
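
A small illustration of the Char vs. "character" discrepancy, again on
plain String with base only and a hypothetical breakOnNeedle helper
(a naive scan, not the library's implementation). The same visible
"character" can be one Char (precomposed U+00E9) or two (the letter e
followed by combining U+0301), and a Char predicate only sees the first
form.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical needle-based break (naive scan).
breakOnNeedle :: String -> String -> (String, String)
breakOnNeedle needle = go []
  where
    go acc rest
      | needle `isPrefixOf` rest = (reverse acc, rest)
    go acc []         = (reverse acc, [])
    go acc (c : rest) = go (c : acc) rest

-- The same text with the accented letter encoded two ways.
precomposed, decomposed :: String
precomposed = "caf\233!"   -- U+00E9
decomposed  = "cafe\769!"  -- 'e' followed by combining U+0301

main :: IO ()
main = do
  print (length precomposed, length decomposed)  -- (5,6)
  print (break (== '\233') precomposed)          -- ("caf","\233!")
  -- The Char predicate never fires on the decomposed form.
  print (break (== '\233') decomposed)           -- ("cafe\769!","")
  -- A needle of two Chars does find it.
  print (breakOnNeedle "e\769" decomposed)       -- ("caf","e\769!")
```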

-- 
Live well,
~wren

