[Haskell-cafe] data.binary get reading beyond end of input bytestring?

Conrad Parker conrad at metadecks.org
Wed Jul 28 04:05:55 EDT 2010


Hi,

I am reading data from a file as strict bytestrings and processing
them in an iteratee. As the parsing code uses Data.Binary, the
strict bytestrings are then converted to lazy bytestrings using
fromWrap, which Gregory Collins posted here in January:

-- | wrapped bytestring -> lazy bytestring
fromWrap :: I.WrappedByteString Word8 -> L.ByteString
fromWrap = L.fromChunks . (:[]) . I.unWrap

The parsing is then done with the library function
Data.Binary.Get.runGetState:

-- | Run the Get monad applies a 'get'-based parser on the input
-- ByteString. Additional to the result of get it returns the number of
-- consumed bytes and the rest of the input.
runGetState :: Get a -> L.ByteString -> Int64 -> (a, L.ByteString, Int64)

The issue I am seeing is that runGetState consumes more bytes than the
length of the input bytestring, while reporting an apparently
successful get (i.e. it does not call error/fail). I was able to work
around this by checking whether the bytes consumed exceed the input
length and, if so, ignoring the result of get and simply prepending
the input bytestring to the next chunk in the continuation.
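
A minimal sketch of that workaround, written against the result triple
of runGetState rather than my actual iteratee code. The names Step,
checkStep and carryOver are hypothetical, made up for this
illustration:

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import Data.Int (Int64)

-- | Outcome of parsing one strict chunk (hypothetical type).
data Step a
  = Parsed a L.ByteString   -- result plus the unconsumed input
  | NeedMore S.ByteString   -- chunk to carry over to the next iteration

-- | Inspect the triple returned by runGetState for a given input chunk.
-- Only the consumed count is examined, so on the NeedMore path the
-- parsed value is never evaluated.
checkStep :: S.ByteString -> (a, L.ByteString, Int64) -> Step a
checkStep chunk (x, rest, consumed)
  | consumed > fromIntegral (S.length chunk) = NeedMore chunk
  | otherwise                                = Parsed x rest

-- | Prepend the carried-over bytes to the next chunk read from the file.
carryOver :: S.ByteString -> S.ByteString -> S.ByteString
carryOver leftover next = leftover `S.append` next
```

In the iteratee, a NeedMore result would feed carryOver with the next
chunk, and the parser would then be re-run on the combined input.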

However, I am curious as to why this apparent lack of bounds checking
happens. My guess is that Get does not check the length of the input
bytestring, perhaps to avoid forcing lazy bytestring inputs; does that
make sense?
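
To illustrate the kind of check I have in mind, here is a minimal
sketch of an explicit length test that avoids measuring the whole lazy
input. It assumes the parser's size is known in advance (4 bytes for
getWord32be); hasAtLeast, runFixed and readWord32 are hypothetical
helpers, not part of Data.Binary:

```haskell
import qualified Data.ByteString.Lazy as L
import Data.Binary.Get (Get, runGet, getWord32be)
import Data.Int (Int64)
import Data.Word (Word32)

-- | True when the lazy input holds at least n bytes. Only a prefix of
-- the input is measured, not the whole bytestring.
hasAtLeast :: Int64 -> L.ByteString -> Bool
hasAtLeast n bs = L.length (L.take n bs) >= n

-- | Run a parser assumed to need exactly n bytes, but only when that
-- many bytes are actually present.
runFixed :: Int64 -> Get a -> L.ByteString -> Maybe a
runFixed n g bs
  | hasAtLeast n bs = Just (runGet g bs)
  | otherwise       = Nothing

-- | Example: a big-endian 32-bit word needs 4 bytes.
readWord32 :: L.ByteString -> Maybe Word32
readWord32 = runFixed 4 getWord32be
```

This only helps for fixed-size records; for variable-length data the
required size is not known before parsing starts.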

Would a better long-term solution be to use a strict-bytestring binary
parser (like cereal)? So far I've avoided that as there is
not yet a corresponding ieee754 parser.
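
For comparison, a minimal sketch of the strict alternative, assuming
cereal's Data.Serialize.Get, whose runGet returns an Either instead of
a bare result. parseWord32 and stepStrict are hypothetical names for
illustration:

```haskell
import qualified Data.ByteString as S
import Data.Serialize.Get (runGet, getWord32be)
import Data.Word (Word32)

-- | Parse one big-endian 32-bit word from a strict chunk.
-- Left carries the parser's error message, e.g. when the chunk is
-- too short.
parseWord32 :: S.ByteString -> Either String Word32
parseWord32 = runGet getWord32be

-- | On failure, hand back the chunk so it can be prepended to the
-- next one; on success, return the parsed value.
stepStrict :: S.ByteString -> Either S.ByteString Word32
stepStrict chunk = either (const (Left chunk)) Right (parseWord32 chunk)
```

Note that stepStrict treats every failure as "need more input", which
is only reasonable if the parser cannot fail for any other reason.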

Conrad.

