[Haskell-cafe] Data.Binary, strict reading

Bertram Felgenhauer bertram.felgenhauer at googlemail.com
Thu Feb 26 02:07:20 EST 2009


Neil Mitchell wrote:
> Hi,
> 
> I want to read a file using Data.Binary, and I want to read the file
> strictly - i.e. when I leave the read file I want to guarantee the
> handle is closed. The reason is that (possibly immediately after) I
> need to write to the file. The following is the magic I need to use -
> is it all necessary, is it guaranteed correct, should I use something
> else?
> 
>         src <- decodeFile "_make/_make"
>         Map.size mp `seq` performGC

With binary 0.5,

    src <- decodeFile "_make/_make"
    return $! src

should close the file, assuming that all the data is read from the file,
thanks to this patch:

  Mon Aug 25 23:01:09 CEST 2008  Don Stewart <dons at galois.com>
    * WHNF the tail of a bytestring on decodeFile, will close the resource

For older versions,

    import qualified Data.Binary.Get as Get

    data EOF = EOF
    instance Binary EOF where
        get = do
           eof <- Get.isEmpty
           return (if eof then EOF else error "EOF expected")
        put EOF = return ()

    ...
    (src, EOF) <- decodeFile "_make/_make"

accomplishes the same effect.

Btw, contrary to what Duncan said, Get is a lazy monad (lazy in its
actions, that is):

    instance Binary EOF where
        get = do
           eof <- Get.isEmpty
           when (not eof) error "EOF expected"
           return EOF
        put EOF = return ()

does not help, because the result (EOF) does not depend on the value
returned by isEmpty.

The idea of using isEmpty for closing the file is not perfect though;
due to the lazy nature of Get, there's a stack overflow lurking below:

    main = do
        encodeFile "w.bin" [0..1000000 :: Int]
        m <- decodeFile "w.bin"
        print $ foldl' (+) 0 (m :: [Int])

One idea to fix this is to force the read data before checking for EOF,
as follows:

   data BinaryRNF a = BinaryRNF a

   instance (NFData a, Binary a) => Binary (BinaryRNF a) where
       get = (\a -> rnf a `seq` BinaryRNF a) `fmap` get
       put (BinaryRNF a) = put a

   main = do
       encodeFile "w.bin" [0..1000000 :: Int]
       (BinaryRNF m, EOF) <- decode `fmap` L.readFile "w.bin"
       print $ foldl' (+) 0 (m :: [Int])

HTH,

Bertram


More information about the Haskell-Cafe mailing list