[Haskell-cafe] Attoparsec concatenating combinator

Bryan O'Sullivan bos at serpentine.com
Thu Jun 2 19:33:07 CEST 2011


On Thu, Jun 2, 2011 at 7:02 AM, Yitzchak Gale <gale at sefer.org> wrote:

> It seems the best I can do is to collect them all in a list and then
> apply concat. But that still copies the text several times.
>

Right. I'd like a no-copy combinator for the same reasons, but I think it's
impossible to do without some low-level support.


> If not, does the internal representation easily admit such a combinator?
>

Not very easily. Internally, attoparsec maintains just three pieces of data
for its state:

   - The current input
   - Any input we received via continuations, in case we need to backtrack
   - A flag that denotes whether we were fed EOF by a continuation

There are no line numbers, no "bytes consumed" counters, nothing else. If
there were a "bytes consumed" counter, it would be possible to write a
"try"-like combinator that would hold onto the current input, run a parser,
append any input received via continuations to the original input, and then
use the counter to slice off the consumed portion of that bytestring without
copying. I can't think of another way to do it. Adding that counter would be
a moderate amount of work, and would presumably have a negative effect on
performance.
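
To make the idea concrete, here is a rough sketch on a toy parser, not on
attoparsec's actual internals. The names S, P, takeWhileP and matched are all
made up for illustration, and the toy ignores input fed in via continuations,
so it only shows the "remember the input, run the parser, slice by the
counter" step. It relies on take for strict bytestrings sharing the
underlying buffer rather than copying it.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8
import Data.Char (isAlpha, isDigit)

-- Hypothetical toy state: the remaining input plus a "bytes consumed" counter.
data S = S { input :: !B.ByteString, consumed :: !Int }

-- Hypothetical toy parser: no continuations, no incremental input.
newtype P a = P { runP :: S -> Maybe (a, S) }

instance Functor P where
  fmap f (P p) = P $ \s -> case p s of
    Nothing      -> Nothing
    Just (a, s') -> Just (f a, s')

instance Applicative P where
  pure a = P $ \s -> Just (a, s)
  P pf <*> P pa = P $ \s -> case pf s of
    Nothing      -> Nothing
    Just (f, s') -> case pa s' of
      Nothing       -> Nothing
      Just (a, s'') -> Just (f a, s'')

-- A primitive that consumes a prefix and bumps the counter accordingly.
takeWhileP :: (Char -> Bool) -> P B.ByteString
takeWhileP f = P $ \(S i n) ->
  let (h, t) = B8.span f i
  in Just (h, S t (n + B.length h))

-- Run a parser, discard its result, and return everything it consumed as a
-- single slice of the original input. B.take shares the buffer, so no bytes
-- are copied.
matched :: P a -> P B.ByteString
matched (P p) = P $ \s0 -> case p s0 of
  Nothing      -> Nothing
  Just (_, s1) ->
    Just (B.take (consumed s1 - consumed s0) (input s0), s1)
```

With this, running matched (takeWhileP isDigit *> takeWhileP isAlpha) on the
input "123abc!" gives back the single slice "123abc", leaving "!" as the
remaining input, without concatenating the two pieces.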