Haskell Platform Proposal: add the 'text' library
wren ng thornton
wren at community.haskell.org
Fri Sep 10 17:57:05 EDT 2010
On 9/10/10 5:18 PM, Bryan O'Sullivan wrote:
> On Fri, Sep 10, 2010 at 2:03 PM, wren ng thornton
> <wren at community.haskell.org> wrote:
>> Yes, that was my point. I can see uses for (Text->...),
>> ((Text->Bool)->...), and ((Char->Bool)->...) but the middle one ---which
>> seems to be the closest analogue to String and ByteString--- is missing. The
>> first one is posited as a replacement for the middle one, but it is
>> insufficient since it cannot perform disjunctive searches.
> I don't think anyone posited it as a replacement for the middle one?
You did, kinda :) More specifically, you proposed (Text->...) as the
analogue for ((Char->Bool)->...) and ((Char8->Bool)->...).
> could replace Char->Bool with Text->Bool, but it would be slower (and yes,
> that matters to me). I don't intend to add it myself, but you're welcome to
> put together a patch and a set of QuickCheck tests.
>> Why do we not just have the middle ((Text->Bool)->...) option?
> Because you can't do a Boyer-Moore search off it.
I'm fine with the performance argument, I'm just pointing out why I see
the API as inconsistent with the String/ByteString APIs. Since the break
function for String/ByteString is rather entrenched as being a method
for breaking via a single character, a function that uses Boyer-Moore
to break on an arbitrary string (not just the short strings needed to
cover the mismatch between a "character" and a Char) doesn't seem like
the analogous function. I think it's much closer to breakSubstring than
it is to break. Whether
break/breakSubstring or breakBy/break is the better set of names, that's
a different bike shed.
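
To make the three shapes concrete, here is a rough sketch over plain
String using only base, so it says nothing about how text or bytestring
actually implement anything. The names breakOnChar, breakOnSubstring and
breakOnSuffix are made up for illustration, the substring scan is naive
rather than Boyer-Moore, and reading ((Text->Bool)->...) as "a predicate
on the remaining input" is just one possible interpretation.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- Shape 1: ((Char -> Bool) -> ...). This is just the Prelude's break.
breakOnChar :: (Char -> Bool) -> String -> (String, String)
breakOnChar = break

-- Shape 2: (needle -> ...), in the spirit of breakSubstring.
-- Hypothetical helper: a naive scan, not Boyer-Moore.
breakOnSubstring :: String -> String -> (String, String)
breakOnSubstring needle = breakOnSuffix (needle `isPrefixOf`)

-- Shape 3: a predicate on the remaining input, standing in for
-- ((Text -> Bool) -> ...). Splits at the first position where the
-- predicate holds for the rest of the string.
breakOnSuffix :: (String -> Bool) -> String -> (String, String)
breakOnSuffix p = go
  where
    go s | p s = ([], s)
    go []      = ([], [])
    go (c:cs)  = let (before, after) = go cs in (c : before, after)

main :: IO ()
main = do
  print (breakOnChar isSpace "foo bar")          -- ("foo"," bar")
  print (breakOnSubstring "::" "a::b")           -- ("a","::b")
  -- A disjunctive search: stop at whichever of "::" or "->" comes first.
  print (breakOnSuffix (\s -> "::" `isPrefixOf` s || "->" `isPrefixOf` s)
                       "a->b::c")                -- ("a","->b::c")
```

The last call is the disjunctive case: a single fixed needle cannot
express "either of these two", while a predicate can, at the cost of
giving up a Boyer-Moore style search.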
As for ((Text->Bool)->...) vs ((Char->Bool)->...), I pointed it out
because you've mentioned the discrepancy between Char and "characters".
In practice, I'd expect that the majority of characters that people wish
to break on are indeed Chars, so performance wins out in API design.
However, there's no mention of the discrepancy in the documentation,
which I think is an oversight.
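
For anyone who hasn't run into the discrepancy, a small sketch with
plain String and base only (the name decomposed is just for this
example): an "é" written as 'e' followed by U+0301 COMBINING ACUTE
ACCENT is one character to the reader but two Chars to a
(Char->Bool) predicate.

```haskell
-- "café!" with the é spelled as 'e' plus U+0301 (combining acute accent).
decomposed :: String
decomposed = "cafe\x0301!"

main :: IO ()
main = do
  -- One user-perceived character, two Chars.
  print (length "e\x0301")                                    -- 2
  -- Looking for the precomposed é (U+00E9) never matches here.
  print (snd (break (== '\xe9') decomposed) == "")            -- True
  -- Matching on 'e' does split, but the accent is left at the
  -- start of the remainder rather than treated as part of a unit.
  print (break (== 'e') decomposed == ("caf", "e\x0301!"))    -- True
```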