[Haskell-cafe] data.binary get reading beyond end of input bytestring?

Duncan Coutts duncan.coutts at googlemail.com
Thu Jul 29 09:15:16 EDT 2010


On Thu, 2010-07-29 at 19:17 +0900, Conrad Parker wrote:
> On 29 July 2010 19:13, Duncan Coutts <duncan.coutts at googlemail.com> wrote:
> > On Thu, 2010-07-29 at 19:01 +0900, Conrad Parker wrote:
> >> On 29 July 2010 17:46, Duncan Coutts <duncan.coutts at googlemail.com> wrote:
> >> > On 29 July 2010 07:53, Conrad Parker <conrad at metadecks.org> wrote:
> >> >
> >> >>> Something smells fishy here. I have a hard time believing that binary is
> >> >>> reading more input than is available? Could you post more code please?
> >> >>
> >> >> The issue seems to just be the return value for "bytes consumed" from
> >> >> getLazyByteString. Here's a small example.
> >> >
> >> > http://hackage.haskell.org/packages/archive/binary/0.5.0.2/doc/html/Data-Binary-Get.html#v%3AgetLazyByteString
> >> >
> >> > getLazyByteString :: Int64 -> Get ByteString
> >> > An efficient get method for lazy ByteStrings. Does not fail if fewer
> >> > than n bytes are left in the input.
> >> >
> >> >
> >> > Because it does it lazily it cannot check if it's gone past the end of
> >> > the input. Arguably this is crazy and the function should not exist.
> >>
> >> cheers Duncan, that confirms my guess about the reason. Would you
> >> accept a patch quoting you on that last point to the comment? ;-)
> >
> > The consensus plan amongst the binary hackers is to eliminate lazy
> > lookahead functions and to rebuild binary on top of a continuation style
> > using strict chunks (then with lazy decoding built on top).
> 
> I'll take that as a no on the patch.

Oh, sorry, a documentation patch is fine.

> How would that plan differ from having an iteratee version of
> data.binary? ie. something that is easily compatible with
> WrappedByteString, as the existing Data.Binary is easily compatible
> with Data.ByteString.Lazy?

No idea what WrappedByteString is.

It would look like attoparsec's resumable parser:

data Result a = Fail !ByteString
              | Partial (ByteString -> Result a)
              | Done !ByteString a

runGet :: Get a -> ByteString -> Result a

The point is that you feed it strict bytestring chunks. Decoding a lazy
bytestring can then be implemented on top easily, as can decoding a
sequence lazily.
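
To make the "lazy decoding on top" part concrete, here is a rough sketch,
not the planned binary API. It assumes a parser is just a function from
the first strict chunk to a Result, and that an empty chunk means end of
input. The names takeBytes and runLazy are made up for illustration.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L

-- Result type as in the message, with ByteString meaning the strict type.
data Result a = Fail !B.ByteString
              | Partial (B.ByteString -> Result a)
              | Done !B.ByteString a

-- Hypothetical resumable parser: read exactly n bytes, asking for
-- another chunk whenever the input seen so far is too short.
-- Assumption: an empty chunk signals end of input.
takeBytes :: Int -> B.ByteString -> Result B.ByteString
takeBytes n = go B.empty
  where
    go acc chunk
      | B.length acc' >= n = let (x, rest) = B.splitAt n acc'
                             in Done rest x
      | B.null chunk       = Fail acc'          -- ran out of input
      | otherwise          = Partial (go acc')  -- need more chunks
      where acc' = B.append acc chunk

-- Hypothetical driver: decode from a lazy bytestring by feeding its
-- strict chunks one at a time. Chunks that were never fed are left
-- untouched, so only as much input as the parser needs is forced.
runLazy :: (B.ByteString -> Result a)
        -> L.ByteString
        -> Either B.ByteString (a, L.ByteString)
runLazy p lbs = case L.toChunks lbs of
    []     -> feed (p B.empty) []
    (c:cs) -> feed (p c) cs
  where
    feed (Done rest x) cs     = Right (x, L.fromChunks (rest : cs))
    feed (Fail rest)   _      = Left rest
    feed (Partial k)   (c:cs) = feed (k c) cs
    -- End of input: pass the empty chunk. This assumes the parser
    -- stops asking for more once it has seen an empty chunk.
    feed (Partial k)   []     = feed (k B.empty) []
```

With this shape, running out of input shows up as an explicit Fail
instead of a short result, which is the property getLazyByteString
lacks.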

I imagine you could fairly easily interface it with iteratee too.

Duncan


