[GHC] #13246: hPutBuf issues unnecessary empty write syscalls for large writes

GHC ghc-devs at haskell.org
Thu Feb 9 00:03:40 UTC 2017


#13246: hPutBuf issues unnecessary empty write syscalls for large writes
-------------------------------------+-------------------------------------
           Reporter:  nh2            |             Owner:
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:  8.2.1
          Component:  Runtime        |           Version:  8.0.2
  System                             |
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  Runtime
  Unknown/Multiple                   |  performance bug
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:  #13245
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 To get good performance, it is better to use few system calls that write
 lots of data in batch.

 I found a bug in hPutBuf that makes this concept not work: When using
 `hPutBuf` with a buffer size greater than 8095 (this number it self is a
 bug, #13245) bytes, two syscalls are issued instead of one: one empty
 `write("")` (a zero-bytes-write, which can't do anything useful), and
 after that the actual useful `write()` of the data.

 Example code:

 {{{
 main = do
   withBinaryFile "testfile2" WriteMode $ \ hTo -> do
     let bufferSize = 8096
     allocaBytes bufferSize $ \buffer -> do
       Main.hPutBuf hTo buffer bufferSize

 }}}

 In `strace -f -T` on the compiled binary, we see the syscalls:

 {{{
 write(3, "", 0)                         = 0 <0.000004>
 write(3,
 "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
 8096) = 8096 <0.000017>
 }}}

 As you can see in the timings, this also has a fairly large performance
 overhead (20% in this case).

 When using `bufferSize = 8095`, the `write("")` disappears.

 The problem is this code for `bufWrite` (called by `hPutBuf`):

 {{{
 bufWrite :: Handle__-> Ptr Word8 -> Int -> Bool -> IO Int
 bufWrite h_ at Handle__{..} ptr count can_block =
   seq count $ do  -- strictness hack
   old_buf at Buffer{ bufRaw=old_raw, bufR=w, bufSize=size }
      <- readIORef haByteBuffer

   -- enough room in handle buffer?
   hPutStrLn System.IO.stderr (show (size, w, count))
   if (size - w > count)
         -- There's enough room in the buffer:
         -- just copy the data in and update bufR.
         then do debugIO ("hPutBuf: copying to buffer, w=" ++ show w)
                 copyToRawBuffer old_raw w ptr count
                 writeIORef haByteBuffer old_buf{ bufR = w + count }
                 return count

         -- else, we have to flush
         else do debugIO "hPutBuf: flushing first"
                 old_buf' <- Buffered.flushWriteBuffer haDevice old_buf
                         -- TODO: we should do a non-blocking flush here
                 writeIORef haByteBuffer old_buf'
                 -- if we can fit in the buffer, then just loop
                 if count < size
                    then bufWrite h_ ptr count can_block
                    else if can_block
                            then do writeChunk h_ (castPtr ptr) count
                                    return count
                            else writeChunkNonBlocking h_ (castPtr ptr)
 count
 }}}

 The check `if (size - w > count)` should be `if (size - w >= count)`
 instead, because we can do the write all fine if it fits exactly.

 In the adversarial case, `size - w == count`, we go into the `hPutBuf:
 flushing first` branch, thus emitting the `write("")`.

 See
 https://github.com/ghc/ghc/blame/876b00ba25a615423f48b0cf9d443a9fd5dbd6f4/libraries/base/GHC/IO/Handle/Text.hs#L740
 for the full code.

 Simon Marlow has confirmed this on IRC, I'll submit a patch for it that
 switches to `>=`.

 It would be nice if the fix could be released in both GHC 8.2 and and
 8.0.3.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13246>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list