[Haskell-cafe] Re: [Haskell] string type class

Henning Thielemann lemming at henning-thielemann.de
Sun Mar 8 19:12:32 EDT 2009


On Sat, 7 Mar 2009, Duncan Coutts wrote:

> On Fri, 2009-03-06 at 19:16 +0000, Chris Kuklewicz wrote:
>>
>> Not likely.
>>
>> I did define my own (private) class for regular expressions, to abstract over
>> String, the ByteStrings, and Seq Char.  But it is used in one place and is a
>> wart that should be removed.
>>
>> The simple task of looping over the contents of a String (once, forward) is
>> quite different from a Strict ByteString (using an index and a lookup).
>>
>> This means for decent efficiency I need two copies of my code, hand specialized
>> to each case.
>>
>> "tail" or "(x:xs)" : very efficient for String (no allocation)
>> "tail" or "uncons" : not efficient for ByteString (allocation, might as well
>> convert to [Char]
>
> head/tail/uncons on a strict or lazy ByteString is perfectly ok. I would
> recommend it over using length+index. If you are using tail in a tail
> recursion then ghc can almost always optimise it to unbox the various
> ByteString components and so there is no allocation of fresh BS
> constructors, just adjusting offsets and lengths held in registers.

Recently I also used 'ByteString.uncons' in a parser, because I thought 
that it is cheap. Actually the parsing became slower than with String. I 
then used wrappers to strict and lazy ByteStrings, which maintain the 
index of the current reader position. Although they duplicate the 'start' 
index of the internal ByteString records, these wrappers are faster, 
probably because no ByteString record must be allocated in each step.
   http://code.haskell.org/~thielema/tagchup/src/Text/HTML/Tagchup/Parser/Stream.hs


More information about the Haskell-Cafe mailing list