[Haskell-cafe] binary package: memory problem decoding an IntMap
manlio_perillo at libero.it
Thu Apr 2 05:54:34 EDT 2009
I'm having memory problems decoding a big IntMap.
The data structure is:
IntMap (UArr (Word16 :*: Word8))
There are 480189 keys and a total of 100480507 elements.
The size of the encoded (and compressed) data is 184 MB.
When I load data from the Netflix Prize data set, total memory usage is
However, when I try to decode the data, memory usage grows too much (even
when using the -F1.1 option in the RTS).
The problem seems to be with the `fromAscList` function, defined as:
fromList :: [(Key,a)] -> IntMap a
fromList xs
  = foldlStrict ins empty xs
  where
    ins t (k,x) = insert k x t
(By the way, why does the IntMap module not use Data.List.foldl'?)
The `ins` function is not strict.
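If the laziness of `ins` is indeed the cause, a strict variant might look like the sketch below. `fromListStrict` is a hypothetical name and is not part of Data.IntMap; the sketch assumes that forcing each value to weak head normal form before inserting it is enough to avoid building thunks, which may not hold for every value type.

```haskell
import qualified Data.IntMap as IntMap
import Data.IntMap (IntMap)
import Data.List (foldl')

-- Hypothetical helper, not part of Data.IntMap: build the map with a strict
-- left fold and force each value to weak head normal form before inserting.
fromListStrict :: [(Int, a)] -> IntMap a
fromListStrict = foldl' ins IntMap.empty
  where
    -- `seq` evaluates x before the insert result is returned, so the map
    -- stores an evaluated value rather than a suspended computation.
    ins t (k, x) = x `seq` IntMap.insert k x t

main :: IO ()
main = do
  let m = fromListStrict (zip [1 .. 5] (map (* 2) [1 .. 5 :: Int]))
  print (IntMap.toAscList m)
```

Note that `seq` only evaluates the outermost constructor; a value with lazy fields would need a deeper form of evaluation.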
This seems a hard problem to solve.
First of all, IntMap should provide strict variants of the implemented
And the binary package should choose whether to use the strict or lazy version.
For me, the simplest solution is to serialize the association list
obtained from the `toAscList` function, instead of directly serializing the
The question is: can I "reuse" the data already serialized?
Is the binary format of `IntMap a` and `[(Int, a)]` compatible?
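One way to investigate the compatibility question empirically is to compare the two encodings on a small sample, and then try decoding the map's bytes as an association list. The sketch below assumes the `binary` and `containers` packages are available; `sample`, `sameEncoding` and `decodeViaList` are hypothetical names, and the value type is simplified to `[Word8]` instead of the `UArr` from the original data. It only tests one small sample with whatever package versions are installed, so it is a check rather than a guarantee.

```haskell
import Data.Binary (decode, encode)
import qualified Data.ByteString.Lazy as BL
import qualified Data.IntMap as IntMap
import Data.IntMap (IntMap)
import Data.List (foldl')
import Data.Word (Word8)

-- Small stand-in for the real data; the value type is simplified to [Word8].
sample :: IntMap [Word8]
sample = IntMap.fromList [(1, [10, 20]), (7, [30]), (42, [])]

-- True when both encodings are byte-for-byte identical for this sample.
sameEncoding :: Bool
sameEncoding = encode sample == encode (IntMap.toAscList sample)

-- Hypothetical decoder: read the bytes as an association list, then
-- insert strictly, forcing each value to weak head normal form.
decodeViaList :: BL.ByteString -> IntMap [Word8]
decodeViaList bytes =
  foldl' ins IntMap.empty (decode bytes :: [(Int, [Word8])])
  where
    ins t (k, x) = x `seq` IntMap.insert k x t

main :: IO ()
main = do
  print sameEncoding
  print (decodeViaList (encode sample) == sample)
```

Even if the formats match, decoding the whole association list before building the map may still hold a lot of data in memory at once, so this addresses the compatibility question more than the memory one.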
Thanks,
Manlio Perillo