Proposal: add ByteString support to unix:System.Posix.IO API

wren ng thornton wren at freegeek.org
Wed Mar 2 04:23:56 CET 2011


On 3/1/11 1:35 AM, Bryan O'Sullivan wrote:
> I want to see four entry points for writing:
>
> fdWrite :: Strict.ByteString ->  IO Int
> fdWriteAll :: Strict.ByteString ->  IO ()
> fdWritev :: [Strict.ByteString] ->  IO Int -- turn the list into an iovec,
> then call writev
> fdWritevAll :: [Strict.ByteString] ->  IO ()

Using writev requires the length of the list in order to get a count of 
chunks, which forces us to hold the whole list/lazy-bytestring in memory 
at once and also adds O(n) time for traversing it. Also it'd require 
converting each of the ByteString structs into iovec structs (whereas 
using write allows this to be unpacked into the call frames for write).

What's the benefit of doing this? Is writev that much more efficient 
than Haskell code with the same semantics[1]?


> People would normally use the 'All' variants, but there are times when you
> really do want to know if you've performed a short write so that you can
> handle it yourself.

What are the desired semantics for the All variants? Should it retry, or 
fail? How many times should it retry? etc.

For manual recovery from partial writes, it would be better to have the 
basic function be:

     fdWriteFoo
         :: [Strict.ByteString] -- or Lazy.ByteString, whichever
         -> IO
             -- The total count of bytes written
             ( ByteCount
             -- The remaining content, with the first chunk already
             -- accounting for the last partial write (by adjusting
             -- the ByteString's offset).
             , [Strict.ByteString]
             )

So that the head of the lazy bytestring can be garbage collected and so 
you don't have to traverse it again to figure out where printing left off.

Or actually, we'd want:

     fdWriteBar
         :: [Strict.ByteString]
         -> IO
             -- The total count of bytes written
             ( ByteCount
             -- The count of bytes written from the first chunk of
             -- the remaining content.
             , ByteCount
             -- The remaining content, with the first chunk not
             -- accounting for the last partial write (use the second
             -- ByteCount to account for it).
             , [Strict.ByteString]
             )

When using a lazy bytestring for the input then this latter version 
would be silly. However, with a list of bytestrings, there can be 
semantics encoded into how the string is chopped up and we shouldn't 
corrupt that information by adjusting any of the chunks. This is one 
reason why using lazy bytestrings gives cleaner semantics.



[1]
fdWrite :: Fd -> BL.ByteString -> IO ByteCount
fdWrite fd = go 0
     where
     -- We want to do a left fold in order to avoid stack overflows,
     -- but we need to have an early exit for incomplete writes
     -- (which normally requires a right fold). Hence this recursion.
     go acc BLI.Empty        = return acc
     go acc (BLI.Chunk c cs) = do
         rc <- PosixBS.fdWrite fd c
         let acc' = acc+rc in acc' `seq` do
         if rc == fromIntegral (BS.length c)
             then go acc' cs
             else return acc'

-- 
Live well,
~wren



More information about the Libraries mailing list