[commit: ghc] master: bufWrite: Save extra syscall when data fills handle buffer completely. (6b4e46a)

git at git.haskell.org git at git.haskell.org
Sun Feb 12 01:09:13 UTC 2017


Repository : ssh://git@git.haskell.org/ghc

On branch  : master
Link       : http://ghc.haskell.org/trac/ghc/changeset/6b4e46a1ad57f90726de7e5b7f59814aea5f03ef/ghc

>---------------------------------------------------------------

commit 6b4e46a1ad57f90726de7e5b7f59814aea5f03ef
Author: Niklas Hambüchen <mail at nh2.me>
Date:   Sat Feb 11 19:24:49 2017 -0500

    bufWrite: Save extra syscall when data fills handle buffer completely.
    
    The bug is that the check `if (size - w > count)` should be
    `if (size - w >= count)` instead (`>=` instead of `>`),
    because we can do the write all fine if it fits exactly.
    This allows us to do 1 instead of 2 write syscalls the case it fits.
    
    An example of when this matters is when an application writes output
    in chunks that are a fraction of the handle buffer size.
    
    For example, assume the buffer size is 8 KB, and the application writes
    four 2 KB chunks.
    Until now, this would result in 3 copies to the handle buffer, but
    the 4th one would not be allowed in by `size - w > count`
    (because `size - w == count` is the case), so we'd end up with a write
    syscall of only 6 KB data instead of 8 KB, thus creating more syscalls
    overall.
    
    Implementing this fix (switching to `size - w >= count`), we also have
    to flush the buffer if we fill it completely.
    
    If we made only the changes described so far, that would have the
    unintended side effect that writes of the size equal to the handle
    buffer size (`count == size`) suddenly also go to the handle buffer
    first: The data would first be copied to the handle buffer, and then
    immediately get flushed to the underlying FD.  We don't want that extra
    `memcpy`, because it'd be unnecessary: The point of handle buffering is
    to coalesce smaller writes, and there are no smaller writes in this
    case.  For example, if you specify 8 KB buffers (which menas you want
    your data to be written out in 8 KB blocks), and you get data that's
    already 8 KB in size, you can write that out as an 8 KB straight away,
    zero-copy fashion.  For this reason, adding to the handle buffer now got
    an additional condition `count < size`.  That way, writes equal to the
    buffer size go straight to the FD, as they did before this commit.
    
    Reviewers: simonmar, austin, hvr, bgamari
    
    Reviewed By: simonmar
    
    Subscribers: mpickering, thomie
    
    Differential Revision: https://phabricator.haskell.org/D3117


>---------------------------------------------------------------

6b4e46a1ad57f90726de7e5b7f59814aea5f03ef
 libraries/base/GHC/IO/Handle/Text.hs | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/libraries/base/GHC/IO/Handle/Text.hs b/libraries/base/GHC/IO/Handle/Text.hs
index c53b429..8d92738 100644
--- a/libraries/base/GHC/IO/Handle/Text.hs
+++ b/libraries/base/GHC/IO/Handle/Text.hs
@@ -736,13 +736,30 @@ bufWrite h_ at Handle__{..} ptr count can_block =
   old_buf at Buffer{ bufRaw=old_raw, bufR=w, bufSize=size }
      <- readIORef haByteBuffer
 
-  -- enough room in handle buffer?
-  if (size - w > count)
-        -- There's enough room in the buffer:
+  -- TODO: Possible optimisation:
+  --       If we know that `w + count > size`, we should write both the
+  --       handle buffer and the `ptr` in a single `writev()` syscall.
+
+  -- Need to buffer and enough room in handle buffer?
+  -- There's no need to buffer if the data to be written is larger than
+  -- the handle buffer (`count >= size`).
+  if (count < size && count <= size - w)
+        -- We need to buffer and there's enough room in the buffer:
         -- just copy the data in and update bufR.
         then do debugIO ("hPutBuf: copying to buffer, w=" ++ show w)
                 copyToRawBuffer old_raw w ptr count
-                writeIORef haByteBuffer old_buf{ bufR = w + count }
+                let copied_buf = old_buf{ bufR = w + count }
+                -- If the write filled the buffer completely, we need to flush,
+                -- to maintain the "INVARIANTS on Buffers" from
+                -- GHC.IO.Buffer.checkBuffer: "a write buffer is never full".
+                if (count == size - w)
+                  then do
+                    debugIO "hPutBuf: flushing full buffer after writing"
+                    flushed_buf <- Buffered.flushWriteBuffer haDevice copied_buf
+                            -- TODO: we should do a non-blocking flush here
+                    writeIORef haByteBuffer flushed_buf
+                  else do
+                    writeIORef haByteBuffer copied_buf
                 return count
 
         -- else, we have to flush any existing handle buffer data



More information about the ghc-commits mailing list