getLazyByteString network I/O Bug

Lennart Kolmodin kolmodin at dtek.chalmers.se
Thu May 1 14:33:04 EDT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello!

Right, so it comes down to how binary handles chunks of the ByteString
that might not yet be available.
We have to decide what is reasonable to handle, and what could possibly
trigger an error.

splitAtST is too strict in the #ifndef BYTESTRING_IN_BASE version. Why do
the versions differ? I don't know.
Even if it does not need any more data, it forces the next chunk. This is
easy to fix, and the fix solves the problem where you ask for some data
that is available, but the data after it is not. As I don't fully
understand the splitAtST solution, I can't say which splitAtST version is
buggy.
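
To illustrate the kind of laziness meant here, below is a minimal sketch of
a split that never inspects the chunk after the one that satisfies the
request. It is not binary's splitAtST; the name lazySplitAt is made up for
this example, and it assumes the Empty/Chunk constructors exported by
Data.ByteString.Lazy.Internal.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Internal as LI
import Data.Int (Int64)

-- Hypothetical helper, not part of binary: split off the first n bytes
-- without forcing any chunk beyond the one that completes the request.
lazySplitAt :: Int64 -> L.ByteString -> (L.ByteString, L.ByteString)
lazySplitAt n lbs
    | n <= 0    = (LI.Empty, lbs)          -- lbs is never inspected
    | otherwise = case lbs of
        LI.Empty -> (LI.Empty, LI.Empty)
        LI.Chunk x xs
            | n <= len ->
                -- The request ends inside (or exactly at the end of) this
                -- chunk, so the tail xs is passed along unevaluated.
                ( LI.Chunk (B.take (fromIntegral n) x) LI.Empty
                , if n == len
                    then xs
                    else LI.Chunk (B.drop (fromIntegral n) x) xs )
            | otherwise ->
                -- More data is needed; only now is xs inspected, and only
                -- when the first component is consumed past this chunk.
                let (a, b) = lazySplitAt (n - len) xs
                in  (LI.Chunk x a, b)
          where len = fromIntegral (B.length x)
```

With this definition, asking for exactly the bytes of the first chunk
yields them even if the rest of the input is a stand-in for data that has
not arrived, e.g. fst (lazySplitAt 5 (LI.Chunk fiveBytes undefined)).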

Then reading the last available data would create a state
    S B.empty lbs x
where lbs is an unevaluated lazy bytestring. All is well, as lbs remains
unevaluated.
Curt, this is probably the bug you ran into.


However, there is another issue: if you run any action and ask for 0 bytes,
like
    * getByteString 0, or
    * getLazyByteString 0
and no bytes are available, it would still block.

Like so:
    runGet (getByteString 0) (...blocking lazy bytestring...)

This is because creating the state (mkState) forces the first chunk.

mkState :: L.ByteString -> Int64 -> S
mkState l = case l of
    L.Empty      -> S B.empty L.empty
    L.Chunk x xs -> S x xs

Is it acceptable to block in this case?

By making splitAtST not force its last argument, and letting
getLazyByteString and getBytes return an empty string immediately when
0 bytes are requested, we solve the three issues.
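
A rough sketch of the zero-byte short-circuit, using a toy state type
rather than binary's real Get monad. S and mkState follow the shapes quoted
in this message; takeBytes and runToy are names invented for the example,
and undefined stands in for input whose first chunk has not arrived yet.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Internal as LI
import Data.Int (Int64)

-- Toy stand-in for the parser state: current chunk, remaining input,
-- and the number of bytes consumed so far.
data S = S B.ByteString L.ByteString Int64

-- Same shape as the mkState quoted above: the case forces the first chunk
-- as soon as the resulting state is demanded.
mkState :: L.ByteString -> Int64 -> S
mkState l = case l of
    LI.Empty      -> S B.empty L.empty
    LI.Chunk x xs -> S x xs

-- Hypothetical getter (not binary's getBytes). For n <= 0 it answers
-- without pattern matching on the state, so mkState is never run.
takeBytes :: Int -> S -> (B.ByteString, S)
takeBytes n st
    | n <= 0 = (B.empty, st)
takeBytes n (S cur rest pos) =
    let input  = LI.chunk cur rest              -- re-attach the current chunk
        (a, b) = L.splitAt (fromIntegral n) input
    in  (L.toStrict a, mkState b (pos + L.length a))

-- Run a toy getter against lazy input and keep only its result.
runToy :: (S -> (a, S)) -> L.ByteString -> a
runToy g input = fst (g (mkState input 0))
```

Here runToy (takeBytes 0) undefined returns the empty string, because the
state thunk built by mkState is never forced; any positive request still
forces the first chunk, as before.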

Too hackish? Would anyone like to shed some light on why splitAtST
forced the last argument?

Cheers,
   Lennart Kolmodin

Curt Sampson wrote:
| dcoutts on #haskell suggested that a difference in behaviour
| between Data.Binary's getByteString and getLazyByteString was a
| bug. Basically, in my particular case I send something to the
| server (or for the purposes of this test, just connect) and it
| sends back a non-newline-terminated response. If I pull it from the
| getContents of the handle using getByteString, it works fine; if I use
| getLazyByteString, it blocks, even though it has all of the data it
| needs to satisfy the request.
|
| BTW, I did try setting the mode of the handle to non-blocking, but it
| didn't make any difference in either case.
|
| I've attached a small project that displays this difference; see the
| README at the top level to see how to try it for yourself.
|
| Incidentally, I'm using ghc 6.8.2 on NetBSD 4, and of course the version
| of Data.Binary that's included in the project, 0.4.1.
|
| Also, let me know if this is about the right amount of work for a test
| case for something like this. It took me about an hour to put together,
| so if you didn't really need all of this, I can probably send you less
| next time.
|
| cjs

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkgaDOAACgkQ4txYG4KUCuEZ3ACfYJIPl/hKYoyudojK9VX5KTBe
kjMAn3jFY0XnZdT7jlOZsJ004q0y6aIo
=eG2B
-----END PGP SIGNATURE-----

