[Haskell-cafe] Attoparsec concatenating combinator

Simon Meier iridcode at gmail.com
Tue Jun 7 10:40:18 CEST 2011


2011/6/6 Bryan O'Sullivan <bos at serpentine.com>:
> On Sun, Jun 5, 2011 at 11:00 AM, Yitzchak Gale <gale at sefer.org> wrote:
>>
>> If behind the scenes the concat is copying directly from slices of the
>> original
>> input, then no, in principle we're not saving much then.
>> I thought there were *two* copies going on.
>
> If you're using the specialised functions like attoparsec's takeWhile, then
> all they do is return a view into the underlying array. No copying occurs
> until the concat itself. Now that I think of it: in principle, you could
> write a specialised concat that would check the pointer/offset/length
> combinations of its arguments and, if they all abutted perfectly, would just
> return a new view into that same array, sans copying. (You'd have to hide it
> behind unsafePerformIO, of course.)

Why would you need 'unsafePerformIO'? You can scrutinise the 'PS'
constructors of the slice without dropping down to IO. The required
'Eq' instance on 'ForeignPtr' is also present.
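
A minimal sketch of such a pure abutment check, not a tested
implementation. The names joinAbutting and concatAbutting are
hypothetical. It assumes the (ForeignPtr, offset, length) fields of 'PS'
are reachable through toForeignPtr and fromForeignPtr from
Data.ByteString.Internal, and that two slices with an equal base pointer
belong to the same buffer.

```haskell
import qualified Data.ByteString as B
import Data.ByteString.Internal (fromForeignPtr, toForeignPtr)

-- | Join two slices without copying when they share a base pointer and
-- the second one starts exactly where the first one ends.
-- (Hypothetical helper, not part of bytestring or attoparsec.)
joinAbutting :: B.ByteString -> B.ByteString -> Maybe B.ByteString
joinAbutting a b
  | fpA == fpB && offA + lenA == offB =
      Just (fromForeignPtr fpA offA (lenA + lenB))
  | otherwise = Nothing
  where
    (fpA, offA, lenA) = toForeignPtr a
    (fpB, offB, lenB) = toForeignPtr b

-- | Merge runs of abutting slices into single views first, then let the
-- ordinary concat copy whatever could not be merged.
concatAbutting :: [B.ByteString] -> B.ByteString
concatAbutting = B.concat . foldr merge []
  where
    merge s []       = [s]
    merge s (t : ts) = case joinAbutting s t of
      Just st -> st : ts
      Nothing -> s : t : ts
```

No IO is involved: the check only reads the three fields and compares
pointers with the existing 'Eq' instance. On bytestring versions that
store the offset inside the pointer, the base pointers of two different
slices will usually differ, so this sketch would simply fall back to the
copying concat there.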

Using a Builder for concatenation makes sense if you want to exploit
the fact that copying a slice of the input array is cheaper right after
it has been inspected (it's fully cached) than later (as happens when
slices are collected in a list). However, you can only have one Builder
at a time, and some low-level meddling is probably required to
interleave feeding the Parser with input arrays with feeding the
Builder with free buffers. Nevertheless, for something like parsing
chunked HTTP content it would make a lot of sense. I'm inclined to look
into that once I've finished porting the blaze Builder to the
'bytestring' library.
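
For illustration only, here is roughly what concatenating slices through
a Builder could look like. This sketch assumes the Builder API that
later shipped in 'bytestring' (Data.ByteString.Builder, version 0.10 or
newer), and concatViaBuilder is a hypothetical name. It shows only the
API shape: the slices are still collected in a list first, so it does
not capture the caching benefit, which would need the copy to happen
inside the parser as each slice is produced.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Builder (toLazyByteString)
import Data.ByteString.Builder.Extra (byteStringCopy)

-- | Concatenate slices by copying each one into the Builder's buffer.
-- byteStringCopy always copies the bytes instead of inserting the
-- slice as a chunk of its own.
-- (Hypothetical helper, not part of attoparsec.)
concatViaBuilder :: [B.ByteString] -> B.ByteString
concatViaBuilder =
  L.toStrict . toLazyByteString . mconcat . map byteStringCopy
```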
