[Haskell-cafe] Re: bizarre memory usage with data.binary

Felipe Almeida Lessa felipe.lessa at gmail.com
Tue Oct 2 20:37:53 EDT 2007


On 10/2/07, Don Stewart <dons at galois.com> wrote:
> The encode instance for lists is fairly strict:
>
>     instance Binary a => Binary [a] where
>         put l  = put (length l) >> mapM_ put l
>         get    = do n <- get :: Get Int
>                     replicateM n get
>
> This is ok, since typically you aren't serialising infinite structures.
>
> Use a newtype, and a lazier instance, if you need to do this.

Maybe something like this (WARNING: ugly code, as in "not elegant", follows):

newtype List a = List [a]

split255 :: [a] -> (Word8, [a], [a])
split255 = s 0 where
    s 255 xs   = (255, [], xs)
    s n (x:xs) = let (n', ys, zs) = s (n+1) xs in (n', x:ys, zs)
    s n []     = (n, [], [])

instance Binary a => Binary (List a) where
  put (List l) = let (n, xs, ys) = split255 l
                 in do putWord8 n
                       mapM_ put xs
                       when (n == 255) (put $ List ys)
  get = do n <- getWord8
           xs <- replicateM (fromEnum n) get
           if n == 255
             then get >>= \(List ys) -> return (List $ xs ++ ys)
             else return (List xs)


It uses chunks of 255 elements and so doesn't traverse the whole list
until starting to output something. OTOH, there's some data overhead
(1 byte every 255 elements). Seems to run your example fine and in
constant memory.

HTH,

-- 
Felipe.


More information about the Haskell-Cafe mailing list