[commit: packages/binary] master: Get: Avoid needless copies of input (7532daa)
git at git.haskell.org
Sat Feb 4 21:17:25 UTC 2017
Repository : ssh://git@git.haskell.org/binary
On branch : master
Link : http://git.haskell.org/packages/binary.git/commitdiff/7532daa8789e5199109bb1fcde367d71effb07e2
>---------------------------------------------------------------
commit 7532daa8789e5199109bb1fcde367d71effb07e2
Author: Ben Gamari <ben at smart-cactus.org>
Date: Sun May 15 23:55:43 2016 +0200
Get: Avoid needless copies of input
My `b-tree` library seems to tickle a rather pathological behavior in
`binary`'s decoding logic, where `binary` will create many needless
copies of the input buffer by evaluating things of the form `B.concat
[B.empty, leftovers]`, where `leftovers` is large.
This resulted in runtimes of over two minutes when parsing a 50 MByte
file. With this fix, the runtime drops to less than 100 milliseconds.
>---------------------------------------------------------------
7532daa8789e5199109bb1fcde367d71effb07e2
src/Data/Binary/Get/Internal.hs | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/src/Data/Binary/Get/Internal.hs b/src/Data/Binary/Get/Internal.hs
index cf2d012..b9a0818 100644
--- a/src/Data/Binary/Get/Internal.hs
+++ b/src/Data/Binary/Get/Internal.hs
@@ -404,7 +404,11 @@ ensureN !n0 = C $ \inp ks -> do
enoughChunks n str
| B.length str >= n = Right (str,B.empty)
| otherwise = Left (n - B.length str)
- onSucc = B.concat
+ -- Sometimes we will produce leftovers lists of the form [B.empty, nonempty]
+ -- where `nonempty` is a non-empty ByteString. In this case we can avoid a copy
+ -- by simply dropping the empty prefix. In principle ByteString might want
+ -- to gain this optimization as well
+ onSucc = B.concat . dropWhile B.null
onFail bss = C $ \_ _ -> Fail (B.concat bss) "not enough bytes"
{-# INLINE ensureN #-}
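
The effect of the new `onSucc` can be seen in isolation with a small sketch. This is not code from `binary`: the names `concatSkippingEmptyPrefix`, `sampleChunks`, `sameResult` and `remainingChunks` are hypothetical, and the 1024-byte chunk merely stands in for a large `leftovers` buffer. It only shows that dropping the empty prefix leaves a single chunk and does not change the resulting bytes. Whether plain `B.concat` actually copies in the `[B.empty, leftovers]` case depends on the `bytestring` version in use, so no timing claim is made here.

```haskell
import qualified Data.ByteString as B

-- Hypothetical stand-in for the patched `onSucc`: drop leading empty
-- chunks, then concatenate whatever remains.
concatSkippingEmptyPrefix :: [B.ByteString] -> B.ByteString
concatSkippingEmptyPrefix = B.concat . dropWhile B.null

-- Illustrative input shaped like the case from the commit message:
-- an empty chunk followed by one larger chunk (1024 bytes of 'A').
sampleChunks :: [B.ByteString]
sampleChunks = [B.empty, B.replicate 1024 0x41]

-- The patched composition must yield the same bytes as plain `B.concat`.
sameResult :: Bool
sameResult = concatSkippingEmptyPrefix sampleChunks == B.concat sampleChunks

-- Number of chunks left once the empty prefix has been dropped.
remainingChunks :: Int
remainingChunks = length (dropWhile B.null sampleChunks)
```

Because only one chunk remains after `dropWhile B.null`, `B.concat` receives a singleton list, which is the situation the comment in the patch relies on to avoid a copy.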