[Haskell-cafe] two problems with Data.Binary and Data.ByteString
dons at galois.com
Tue Aug 12 20:18:20 EDT 2008
> I have a program that read in and populated a large data structure and
> then saved it out with Data.Binary and Data.ByteString.Lazy.Char8:
> saveState db = B.writeFile stateFile =<<
> encode <$> atomically (readTVar db)
> when I go to read this in later I get a stack overflow:
> loadState db = do
> d <- decode <$> B.readFile stateFile
> atomically $ writeTVar db d
> Stack space overflow: current size 8388608 bytes.
> Use `+RTS -Ksize' to increase it.
> or from ghci:
> d <- liftM decode
> "savedState.bin") :: IO InstrsDb
> fromList *** Exception: stack overflow
> The data type I'm storing is a Map (of maps):
> type DailyDb = M.Map Date Daily
> type InstrsDb = M.Map String DailyDb
> What's going on here? Why is the system capable of building and saving
> the data but not in reading and umarhsalling it? What is the proper way
> to track down where the exception is happening? Any debugging tips?
So a big Map is serialised as a huge list.
instance (Ord k, Binary k, Binary e) => Binary (Map.Map k e) where
put = put . Map.toAscList
get = liftM Map.fromDistinctAscList get
so that fromAscList's the result of parsing the map as a list, via,
instance Binary a => Binary [a] where
put l = put (length l) >> mapM_ put l
get = do n <- get :: Get Int
replicateM n get
so that's a length-prefixed list, strictly. Which is possibly where the
stack's being consumed. Does just bumping the stack size a bit help?
Alternatively, you could consider serialising the Map in some other
format (i.e. newtype the list, and serialise that say, in a lazy/chunked
> I also noticed another issue while testing. If my program loads
> the data at startup by calling loadState then all later calls to
> saveState give an error:
> Log: savedState.bin: openFile: resource busy (file is locked)
> this does not occur if the program wasnt loaded. My best guess here
> is that B.readFile isnt completing and closing the file for some
> reason. Is there a good way to force this?
Lazy IO. So force the result to be evaluated, and then close the handle,
or use strict bytestring reading.
More information about the Haskell-Cafe