[Haskell-cafe] Allocating enormous amounts of memory and wondering why

Jefferson Heard jeff at renci.org
Sun Jul 8 17:31:57 EDT 2007


By the way, I've confirmed it doesn't even make it past the call to 

coordinates <- get pointsH :: IO [Float]

It just runs for about 15 seconds and then all the memory is consumed.
I'm using a laptop with 2 GB of RAM and a 2.0 GHz processor, so the read
shouldn't take that long; the wiki says AltBinary can run at around
20-50 MB/sec.  I assume I'm doing something *way* wrong here...
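
As a rough sanity check (the per-element sizes below are just my guesses
at GHC's heap layout, not measurements), even the fully evaluated list
ought to be large before any thunks from the lazy read pile up:

-- Back-of-the-envelope heap cost of a fully evaluated [Float], assuming
-- (not measured) a 3-word (:) cell plus a 2-word Float box per element.
wordSize, perElem, floatListBytes :: Int
wordSize       = 4                   -- bytes per word on 32-bit GHC (8 on 64-bit)
perElem        = 5 * wordSize        -- ~20 bytes per list element
floatListBytes = 4800000 * perElem   -- ~96 MB for the 4.8 million floats alone

main :: IO ()
main = print floatListBytes          -- 96000000

So the list representation alone should be on the order of 100 MB (double
that on a 64-bit machine), and the lazy deserialization presumably piles a
lot of thunks on top of that.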

On Sun, 2007-07-08 at 17:26 -0400, Jefferson Heard wrote:
> I'm using the Data.AltBinary package to read in a list of 4.8 million
> floats and 1.6 million ints.  Doing so causes the memory footprint to
> blow up to more than 2 GB, which on my laptop simply crashes the
> program.  I can do it on my workstation, but I'd really rather not,
> because I want my program to be fairly portable.
> 
> The file I wrote out when packing the data structure is only 28 MB, so
> I assume I'm either using the wrong data structure or relying on
> laziness somewhere I shouldn't be.
> 
> I've tried compiling with profiling enabled, but I wasn't able to,
> because the Streams package doesn't seem to have an option for building
> with profiling.  I'm also a newbie to Cabal, so I'm probably just
> missing something.
> 
> The fundamental question, though, is "Is there something wrong with how
> I wrote the following function?"
> 
> binaryLoadDocumentCoordinates :: String -> IO (Ptr CFloat, [Int])
> binaryLoadDocumentCoordinates path = do
>   pointsH <- openBinaryFile (path ++ "/Clusters.bin") ReadMode
>   coordinates <- get pointsH :: IO [Float]
>   galaxies <- get pointsH :: IO [Int]
>   coordinatesArr <- mallocArray (length coordinates)
>   pokeArray coordinatesArr (map (fromRational . toRational) coordinates)
>   return (coordinatesArr, galaxies)
> 
> I suppose in a pinch I could write a C function that serializes the
> data, but I'd really rather not.  What I'm trying to do is load a bunch
> of coordinates into a vertex array for OpenGL.  I did this with a small
> 30,000-item vertex array, but in the end I need to be able to handle
> several million vertices.
> 
> If I serialize an unboxed array instead of a list, or do repeated
> "put_" and "get" calls, will that help with the memory problem?
> 
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe


