[Haskell-cafe] serializing large data structures, stack overflow

Don Stewart dons at galois.com
Sat Mar 7 14:57:57 EST 2009


Increase the stack size, or use a different serialiser (they're only a
half dozen lines to write), or different data structure?

-- Don

frigginfriggins:
> I'm playing around with Netflix, implementing a simple KNN-algorithm, I will
> later try SVD which seems to be the most successful approach.
> 
> Using a database like Postgresqk is to slow so I want to serialize a
> datastructure containing the ratings. I'm not sure about the
> representation I will use just yet, if I should use multiple arrays or an Map/
> IntMap.
> 
> However I tried Data.Binary and already for small sizes I get stack overflow
> when deserializing.
> The serializing works fine but when bringing it back it overflows.
> How can I solve this? This is just 2MB, I will eventually need soemthing like
> 2-500MB to store everything depending on what representatin I choose.
> 
> module Serialize where
> import qualified Data.Binary as B
> import qualified Data.Binary.Put as P
> import qualified Data.Map as M
> import qualified Data.List as L
> 
> genTest :: Int -> M.Map (Int,Int) Int
> genTest n = let movies = take n $ repeat 1
>                 grades = take n $ repeat 4 in
>             M.fromList $ ([1..n] `zip` movies) `zip` grades
> 
> main = do
>   let a = genTest 50000
>   B.encodeFile "C:/users/saftarn/desktop/bintest.txt" a
>   print "Success"
> 
> dec = B.decodeFile "C:/users/saftarn/desktop/bintest.txt" >>= \a ->
>       return $ (a :: M.Map (Int,Int) Int)
> 
> 
> 
> 

> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe



More information about the Haskell-Cafe mailing list