HEAD: Deterioration in ByteString I/O

Daniel Fischer daniel.is.fischer at web.de
Wed Sep 8 21:48:09 EDT 2010


On Thursday 09 September 2010 01:28:04, Daniel Fischer wrote:
> Maybe the following observation helps:
>
> ghc-6.13.20100831 reads lazy ByteStrings in chunks of 8192 bytes.
>
> If I understand correctly, that means (since defaultChunkSize = 32760)
> - bytestring allocates a 32K buffer to be filled and asks ghc for 32760
> bytes in that buffer
> - ghc asks the OS for 8192 bytes (and usually gets them)
> - upon receiving fewer bytes than requested, bytestring copies them to a
> new smaller buffer
> - since the number of bytes received is a multiple of ghc's allocation
> block size (which I believe is 4K), there's no space for the bookkeeping
> overhead, hence the new buffer takes up 12K instead of 8, resulting in
> 44K allocation for 8K bytes
>
> That factor of 5.5 corresponds pretty well with the allocation figures
> above,

That seems to be correct, but it is probably not the whole story.
I've played with defaultChunkSize. With it set to (64K - overhead), ghc 
still reads in 8192-byte chunks and the allocation figures are nearly double 
those for (32K - overhead). With it set to (8K - overhead), ghc reads in 
8184-byte chunks and the allocation figures go down to approximately 1.4 
times those of 6.12.3.
Can a factor of 1.4 be explained by the smaller chunk size, or is something 
else going on?
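
To make the arithmetic behind these figures explicit, here is a rough sketch of the allocation model from the quoted text. It is only that model, not a description of what ghc's allocator actually does. The 4K block size is the guess from the quoted text, the 8-byte overhead is inferred from 32768 - 32760 (and 8192 - 8184), and the names allocatedPerRead and allocationFactor are made up for this illustration.

```haskell
module Main where

-- All sizes are in bytes.

-- Assumed allocation block size (the "which I believe is 4K" guess).
blockSize :: Int
blockSize = 4096

-- Assumed per-buffer bookkeeping overhead, inferred from 32768 - 32760.
overhead :: Int
overhead = 8

-- Round a request up to a whole number of allocation blocks.
roundUpToBlock :: Int -> Int
roundUpToBlock n = ((n + blockSize - 1) `div` blockSize) * blockSize

-- Bytes allocated for one read under the model: one buffer of the
-- requested chunk size, plus a second, smaller buffer for the copy
-- whenever fewer bytes arrive than were requested.
allocatedPerRead :: Int -> Int -> Int
allocatedPerRead chunkSize received
  | received < chunkSize = full + roundUpToBlock (received + overhead)
  | otherwise            = full
  where
    full = roundUpToBlock (chunkSize + overhead)

-- Allocated bytes per byte actually received.
allocationFactor :: Int -> Int -> Double
allocationFactor chunkSize received =
  fromIntegral (allocatedPerRead chunkSize received) / fromIntegral received

main :: IO ()
main = mapM_ report [(32760, 8192), (65528, 8192), (8184, 8184)]
  where
    report (chunkSize, received) =
      putStrLn $ show chunkSize ++ " requested, " ++ show received
        ++ " received: " ++ show (allocatedPerRead chunkSize received)
        ++ " bytes allocated, factor "
        ++ show (allocationFactor chunkSize received)
```

Under these assumptions the model gives 44K per 8K read (factor 5.5) for the default chunk size and 76K per 8K read (factor 9.5) for (64K - overhead), a ratio of about 1.7 between the two settings. For (8K - overhead) it gives a single 8K buffer per read and no copy, so the model by itself does not account for the remaining factor of 1.4.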

> and the extra copying explains the approximate doubling of I/O time.

Apparently not. With the small chunk size, which should avoid the copying, 
the I/O didn't get faster.

>
> Trying to find out why ghc asks the OS for only 8192 bytes instead of
> 32760 hasn't brought enlightenment yet.

No progress on that front.
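
For anyone who wants to check the chunk sizes on their own setup, a minimal sketch along the following lines should do. It is not necessarily how the numbers above were obtained. It reads a file as a lazy ByteString and counts how often each chunk length occurs; chunkHistogram is a hypothetical helper name, and Data.Map.Strict assumes a reasonably recent containers package.

```haskell
module Main where

import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import qualified Data.Map.Strict as M
import System.Environment (getArgs)
import System.IO (hPutStrLn, stderr)

-- Map from chunk length (in bytes) to the number of chunks of that length.
chunkHistogram :: L.ByteString -> M.Map Int Int
chunkHistogram =
  M.fromListWith (+) . map (\chunk -> (S.length chunk, 1)) . L.toChunks

main :: IO ()
main = do
  args <- getArgs
  case args of
    [path] -> do
      contents <- L.readFile path
      -- Print one line per distinct chunk length, smallest first.
      mapM_ (\(len, count) ->
               putStrLn (show len ++ " bytes: " ++ show count ++ " chunks"))
            (M.toList (chunkHistogram contents))
    _ -> hPutStrLn stderr "usage: chunksizes FILE"
```

Running it on a large file with different compilers or different settings of defaultChunkSize shows directly which chunk sizes the lazy read produces; the last chunk of a file will usually be shorter than the rest.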



More information about the Glasgow-haskell-users mailing list