Adding binary to the Haskell Platform
kr.angelov at gmail.com
Thu Aug 6 06:49:02 EDT 2009
Ok. I tried to replace my hacked version of Data.Binary with the
latest version of the library. Indeed the stack overflow error is
fixed. I did some further experiments:
1. After upgrading I tried to compare the two implementations. First I
was surprised to find that my version of binary is 3-4 times slower
than the latest official version. After some experiments I found that
the problem was in my version of Data.Binary.Get.getWord8. In the
official release it is:
getWord8 :: Get Word8
getWord8 = getPtr (sizeOf (undefined :: Word8))
while in my version it was:
getWord8 = do
    S s ss bytes <- get
    case B.uncons s of
      Just (w,rest) -> do put $! S rest ss (bytes + 1)
                          return $! w
      Nothing       -> case L.uncons ss of
                         Just (w,rest) -> do put $! mkState rest (bytes + 1)
                                             return $! w
                         Nothing       -> fail "too few bytes"
I don't remember why I had changed it, but it was probably an attempt
to make it faster, since I use getWord8 often. I don't know what has
changed, but now the first version is much, much faster. When I reverted to
the first version, my library became as fast as the official one.
2. I tried to revert the implementation of the Get monad from strict
to lazy. This made the decoding even faster - from 1.52 sec to 1.08
sec for ~3 MB of data. Good achievement.
3. After the above changes, the only difference between my version and
the official version is that I have a different instance for Int. The
impact is that with my instance the output is 2 to 4 times more
compact. As a consequence, decoding is also about 50% faster.
I know that I can use compression to reduce the size of the output, but
this will only make deserialization slower, not faster. Why is it
so important to store Int as Int64 instead of as a variable-length byte
field? This only adds extra overhead.
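
To illustrate point 3, here is a minimal sketch of one common
variable-length scheme: 7 payload bits per byte plus a continuation bit,
with a zigzag mapping so that small negative values also stay short. It is
only an assumption about what such an instance could look like, not the
actual instance from either version of the library. The names zigZag,
unZigZag, encodeVar, decodeVar, encodeInt and decodeInt are hypothetical,
and the code works on plain [Word8] lists instead of the Put/Get monads so
that it needs nothing beyond base.

```haskell
import Data.Bits (shiftL, shiftR, testBit, xor, (.&.), (.|.))
import Data.Int (Int64)
import Data.Word (Word64, Word8)

-- Zigzag mapping: 0, -1, 1, -2, ... become 0, 1, 2, 3, ...
-- (hypothetical helper; goes through Int64 so the shift width is fixed).
zigZag :: Int -> Word64
zigZag n = fromIntegral ((m `shiftL` 1) `xor` (m `shiftR` 63))
  where m = fromIntegral n :: Int64

-- Inverse of zigZag.
unZigZag :: Word64 -> Int
unZigZag w = fromIntegral (fromIntegral u :: Int64)
  where u = (w `shiftR` 1) `xor` negate (w .&. 1)

-- Emit 7 bits per byte, least significant group first.
-- The high bit of a byte is set when more bytes follow.
encodeVar :: Word64 -> [Word8]
encodeVar w
  | w < 0x80  = [fromIntegral w]
  | otherwise = (fromIntegral (w .&. 0x7F) .|. 0x80) : encodeVar (w `shiftR` 7)

-- Decode one value and return the unconsumed input.
-- Fails on truncated input or on more than ten bytes.
decodeVar :: [Word8] -> Maybe (Word64, [Word8])
decodeVar = go 0 0
  where
    go _ _ [] = Nothing
    go acc shift (b:bs)
      | shift > 63  = Nothing
      | testBit b 7 = go acc' (shift + 7) bs
      | otherwise   = Just (acc', bs)
      where acc' = acc .|. (fromIntegral (b .&. 0x7F) `shiftL` shift)

encodeInt :: Int -> [Word8]
encodeInt = encodeVar . zigZag

decodeInt :: [Word8] -> Maybe (Int, [Word8])
decodeInt bs = case decodeVar bs of
  Nothing        -> Nothing
  Just (w, rest) -> Just (unZigZag w, rest)
```

With this scheme encodeInt 300 takes two bytes, whereas a fixed Int64
field always takes eight. The cost is a little more work per value and a
worst case of ten bytes for values near the ends of the Int64 range.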