[Haskell-cafe] Allocating enormous amounts of memory and
Donald Bruce Stewart
dons at cse.unsw.edu.au
Tue Jul 10 23:09:28 EDT 2007
> I switched to Data.Binary, which dropped me from 2.6GB to 1.5GB, and
> then I switched this afternoon to unboxed arrays from lists of floats,
> and that dropped me again from 1.5GB to 475MB. I think, all told, that
> I'm in an acceptable range now, and thank you for pointing out the
> library mistake. I'm also down from 1.5 minutes load time to under 10
> seconds of load time, which is also very very nice. Incidentally, the
> code I'm now using is:
> binaryLoadDocumentCoordinates ::
>     String -> IO (Ptr Float, Array.UArray Int Int)
> binaryLoadDocumentCoordinates path = do
>   putStrLn "File opened"
>   coordinates <- decodeFile (path ++ "/Clusters.bin")
>                    :: IO (Array.UArray Int Float)
>   print . Array.bounds $ coordinates
>   putStrLn "Got coordinates"
>   galaxies <- decodeFile (path ++ "/Galaxies.bin")
>                 :: IO (Array.UArray Int Int)
>   putStrLn "Got galaxies"
>   coordinatesArr <- mallocArray . snd . Array.bounds $ coordinates
>   putStrLn "Allocated array"
>   pokeArray coordinatesArr . Array.elems $ coordinates
>   return (coordinatesArr, galaxies)
>
> binarySaveDocumentCoordinates :: String -> [Point] -> IO ()
> binarySaveDocumentCoordinates path points = do
>   let len = length points
>   encodeFile (path ++ "Clusters.bin")
>     . (Array.listArray (0,len*3) :: [Float] -> Array.UArray Int Float)
>     . coordinateList . solve $ points
>   encodeFile (path ++ "Galaxies.bin")
>     . (Array.listArray (0,len) :: [Int] -> Array.UArray Int Int)
>     . galaxyList $ points
I've just pushed a patch to Data.Binary in the darcs version that should
help serialising arrays by avoiding forcing an intermediate list.
You can get that here:
darcs get http://darcs.haskell.org/binary
I'd still avoid that 'listArray' call though: you may as well just write
the list out directly, rather than packing it into an array and then
serialising the array back as a list.
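
As a rough sketch of what writing the list out directly could look like,
assuming only the stock Binary instances for lists and Float (the names
saveCoordinates and loadCoordinates and the sample values are made up
for illustration, and are not from the code quoted above):

```haskell
import Data.Binary (decodeFile, encodeFile)

-- Hypothetical helper: serialise the flat coordinate list as-is,
-- with no intermediate array.
saveCoordinates :: FilePath -> [Float] -> IO ()
saveCoordinates = encodeFile

-- Hypothetical helper: read the list back in the same format.
loadCoordinates :: FilePath -> IO [Float]
loadCoordinates = decodeFile

main :: IO ()
main = do
  let coords = [1.0, 2.5, -3.25] :: [Float]  -- made-up sample data
  saveCoordinates "Clusters.bin" coords
  coords' <- loadCoordinates "Clusters.bin"
  -- Check that the round trip preserved the values.
  print (coords' == coords)
```

Note that a file written this way holds a plain list, so the loading
side would need to decode it as a list too (and build any array it
wants itself) rather than decoding straight to a UArray.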