[Haskell-cafe] hash of a lazy bytestring?

David Roundy droundy at darcs.net
Sun May 13 11:30:41 EDT 2007


I was just contemplating hashes, and it occurred to me that it would be
nice to be able to compute the hash of a lazily constructed bytestring and
lazily consume its output without requiring the whole string to ever be in
memory.  Or in general, it'd be nice to be able to perform two simultaneous
consumptions of a lazy list without requiring that the entire list be
stored.  Another example would be computing the length of an ordinary list:

do l <- readFile "foo"
   let len = length l
   writeFile "bar" l
   putStrLn $ "Length is " ++ show l

It would be nice if one could write length such that the above function
doesn't require the entire file be stored in memory.  Are there any
approaches that would make this possible? The only approach I can imagine
would involve something like weak pointers and finalizers.  It seems like
it'd be a useful paradigm, if it's possible.  Basically to be able to write
a lazy consumer that doesn't hold onto its input, but instead is computed
as the garbage collector frees the input.  Something like:

length' :: [a] -> Integer
length' !xs = l' 0 `safeInterleave` xs
   where l' n [] = n
         l' !n (y:ys) = l' (n+1) `safeInterleave` ys

safeInterleave :: (a -> b) -> a -> b
safeInterleave f !x = -- create weak reference to x, associate finalizer,
                      -- so that if x is GCed, then we'll immediately
                      -- evaluate f x before we lose the value of x.

Is this possible? Would it be reasonable? I imagine we'd run into trouble
with laziness keeping safeInterleave from being called, with the result
that we'd hang onto x even though the safeInterleave would have hoped to
allow x to be GCed.
-- 
David Roundy
http://www.darcs.net


More information about the Haskell-Cafe mailing list