ByteString I/O Performance
Peter Simons
simons at cryp.to
Wed Aug 29 16:05:02 EDT 2007
> import System.IO
> import Foreign ( allocaBytes )
> import qualified Data.ByteString as Str
> bufsize :: Int
> bufsize = 4 * 1024
In order to determine I/O performance, a random 512 MB file is copied
from standard input to standard output. All test programs have been
compiled with GHC 6.6.1 using "-O2 -funbox-strict-fields" for
optimization; each timing below lists three runs. The time to beat for
this test comes from /bin/cat:
$ dd if=/dev/urandom of=test.data bs=1M count=512
$ time /bin/cat <test.data >/dev/null
real 0m2.097s 0m2.135s 0m2.100s
user 0m0.036s 0m0.028s 0m0.024s
sys 0m2.060s 0m2.108s 0m2.076s
The first entry is implemented using static buffer I/O:
> catBuf :: Handle -> Handle -> IO ()
> catBuf hIn hOut = allocaBytes bufsize input
>   where
>     input ptr    = hGetBuf hIn ptr bufsize >>= output ptr
>     output _ 0   = return ()
>     output ptr n = hPutBuf hOut ptr n >> input ptr
real 0m2.747s 0m2.737s 0m2.758s
user 0m0.524s 0m0.416s 0m0.632s
sys 0m2.224s 0m2.304s 0m2.124s
The second entry is implemented with ByteString:
> catString :: Handle -> Handle -> IO ()
> catString hIn hOut = Str.hGet hIn bufsize >>= loop
>   where
>     loop buf | Str.null buf = return ()
>              | otherwise    = Str.hPut hOut buf >> catString hIn hOut
real 0m7.852s 0m7.817s 0m7.887s
user 0m4.764s 0m4.800s 0m4.748s
sys 0m3.080s 0m3.000s 0m3.108s
When Data.ByteString.Char8 is used instead, the program produces almost
identical results. Data.ByteString.Lazy, however, came out differently
(a sketch of the lazy variant follows the timings below):
real 0m8.184s 0m8.086s 0m8.067s
user 0m5.104s 0m5.252s 0m4.948s
sys 0m2.940s 0m2.808s 0m3.120s
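A lazy variant along the same lines would look roughly like this. It is
a standalone sketch, kept outside the literate program above because it
needs its own import, and it assumes Data.ByteString.Lazy's hGet, null,
and hPut:

  import System.IO
  import qualified Data.ByteString.Lazy as Lazy

  -- Same copy loop as catString, but over lazy ByteStrings.
  catLazy :: Handle -> Handle -> IO ()
  catLazy hIn hOut = Lazy.hGet hIn (4 * 1024) >>= loop
    where
      loop buf | Lazy.null buf = return ()
               | otherwise     = Lazy.hPut hOut buf >> catLazy hIn hOut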
ByteString turns out to be more than two times slower than ordinary
buffer I/O. This result comes as a surprise because ByteString _is_ an
ordinary memory buffer, so it seems reasonable to expect it to perform
about the same. The reason why ByteString cannot compete with hGetBuf
appears to be Data.ByteString.Base.createAndTrim. That function
allocates space with malloc(), reads data into that buffer, allocates a
new buffer, and then copies the data it has just read from the old
buffer into the new one before returning it. This approach is quite
inefficient for reading large amounts of data.
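To make that two-step pattern concrete, here is a rough sketch of it.
The helper name hGetTwoBuffers is made up for illustration, it is not
the library's createAndTrim, and it assumes the Data.ByteString API in
which packCStringLen has type CStringLen -> IO ByteString:

> -- Sketch of the two-buffer pattern described above (illustration only,
> -- not the library's createAndTrim): the data is read into a temporary
> -- buffer first, and packCStringLen then copies the bytes that were
> -- actually read into a freshly allocated ByteString.
> hGetTwoBuffers :: Handle -> Int -> IO Str.ByteString
> hGetTwoBuffers h size =
>   allocaBytes size $ \ptr -> do
>     n <- hGetBuf h ptr size          -- buffer one: read up to 'size' bytes
>     Str.packCStringLen (ptr, n)      -- buffer two: copy the 'n' bytes read

Every chunk that goes through such a path is touched twice: once by the
read and once more by the copy.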
It is particularly odd that Data.ByteString.readFile relies on the same
mechanism. The required buffer size is known in advance. There is no
point in reading data into a temporary buffer. I may have misread the
implementation, but my impression is that readFile currently requires
2*n bytes of memory to read a file of size n.
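The size can in fact be determined before any data is read, along these
lines. This is only a sketch with a made-up helper name, error handling
is omitted, and whether the extra copy disappears still depends on how
hGet is implemented internally:

> -- Sketch: query the file size first, then issue a single read of
> -- exactly that many bytes.  Hypothetical helper, not the library's
> -- readFile; error handling omitted.
> readFileSized :: FilePath -> IO Str.ByteString
> readFileSized path = do
>   h   <- openBinaryFile path ReadMode
>   n   <- hFileSize h                    -- size known before reading
>   buf <- Str.hGet h (fromIntegral n)    -- one read into a buffer of that size
>   hClose h
>   return buf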
It feels like there is plenty of room for optimization. :-)
> main :: IO ()
> main = do
>   mapM_ (\h -> hSetBuffering h NoBuffering) [ stdin, stdout ]
>   catString stdin stdout