[Haskell-beginners] Lazy list Binary instance

uu1101 at gmail.com uu1101 at gmail.com
Mon Nov 15 19:42:16 EST 2010


Dear haskellers,

I need to collect data to a file through a long running computation  
(the population of each generation in an genetic algorithm) and  
afterwards read the file and compute some statistics in a single pass.

I'm using a Writer monad with Data.DList as its Monoid, telling it  
'DList.singleton population', at each generation and 'DList.toList .  
execWriter ...' at the end. Do you think this is the right approach?

As the amount of data is very large I'd like to Data.Binary.serialize  
and Codec.ZLib.compress when writing to the file, and the opposite  
when reading.

As it happens the default list Binary instance does not serve my  
purpose, as it needs to retain the whole list prior to writing the file.

I tried implementing a different Binary instance for the list wrapping  
it in a newtype, but without success so far. This issue has been  
discussed before so I tried using Felipe's code [0] and some variations.

newtype List a = List [a]

split255 :: [a] -> (Word8, [a], [a])
split255 = s 0 where
   s 255 xs   = (255, [], xs)
   s n (x:xs) = let (n', ys, zs) = s (n+1) xs in (n', x:ys, zs)
   s n []     = (n, [], [])

instance Binary a => Binary (List a) where
put (List l) = let (n, xs, ys) = split255 l
                in do putWord8 n
                      mapM_ put xs
                      when (n == 255) (put $ List ys)
get = do n <- getWord8
          xs <- replicateM (fromEnum n) get
          if n == 255
            then get >>= \(List ys) -> return (List $ xs ++ ys)
            else return (List xs)


But in this case the Binary.decode needed to read the whole file prior  
to accessing the first element.

Why does this happen?
How can I both write and read the file lazily?

My guess is that in the line:

then get >>= \(List ys) -> return (List $ xs ++ ys)

the pattern in the lambda forces get to retrieve the next chunk, which  
in turn forces the next... until the end is reached.

Could this be solved by making it a irrefutable pattern? (I'm sorry, I  
tried different variations this afternoon and only now that I can't  
try it this occurs to me).

Best regards,
Adrian.

[0] http://www.mail-archive.com/haskell-cafe@haskell.org/msg30761.html


More information about the Beginners mailing list