[Haskell-beginners] Data.Binary.Get for large files
Philip Scott
haskell-beginners at foo.me.uk
Thu Apr 29 18:37:59 EDT 2010
Hello again folks,
Sorry to keep troubling you - I'm very appreciative of the help you've
given so far. I've got one more for you that has got me totally stumped.
I'm writing a program which deals with largish files; the one I am using
as a test case is not stupidly large, at about 200 MB. After three
evenings, I have finally gotten rid of all the stack overflows, but I am
unfortunately left with something that is infeasibly slow. I was
hoping someone with keener skills than mine could take a look. I've
tried to distill it to the simplest case.
This program just reads in a file, interpreting each value as a double,
and does a sort of running average on them. The actual function doesn't
matter too much, I think it is the reading it in that is the problem.
Here's the code:
import Control.Exception
import qualified Data.ByteString.Lazy as BL
import Data.Binary.Get
import System.IO
import Data.Binary.IEEE754

myGetter acc = do
    e <- isEmpty
    if e
        then return acc
        else do
            t <- getFloat64le
            myGetter $! ((t+acc)/2)

myReader file = do
    h <- openBinaryFile file ReadMode
    bs <- BL.hGetContents h
    return $ runGet (myGetter 0) bs

main = do
    d <- myReader "data.bin"
    evaluate d
This takes about three minutes to run on my (fairly modern) laptop. The
equivalent C program takes about 5 seconds.
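For reference, the accumulation inside myGetter can be looked at on its own, without any file reading. The following is only a minimal sketch: the names step and runningAverage are made up for illustration, and it works on an in-memory list rather than on the Get monad. It shows that each new value is averaged with the accumulator, so the most recent sample always carries half the weight.

```haskell
import Data.List (foldl')

-- One step of the accumulation used in myGetter: average the new
-- sample with the accumulator so far. (Hypothetical helper name.)
step :: Double -> Double -> Double
step acc t = (t + acc) / 2

-- Strict left fold over an in-memory list, starting from 0 as
-- myGetter does. foldl' forces the accumulator at every step,
-- which plays the same role as ($!) in the recursive call.
runningAverage :: [Double] -> Double
runningAverage = foldl' step 0

main :: IO ()
main = print (runningAverage [1, 2, 3, 4])  -- prints 3.0625
```

The intermediate accumulator values for that input are 0.5, 1.25, 2.125 and 3.0625, which is what the file-reading version should compute for a file holding those four doubles.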
I'm sure I am doing something daft, but I can't for the life of me see
what. Any hints about how to get the profiler to show me useful stuff
would be much appreciated!
All the best,
Philip
PS: If, instead of computing a single value, I try to build a list of
the values, the program ends up using over 2 GB of memory to read a 200 MB
file. Any ideas on that one?
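To put that figure in perspective, here is a rough back-of-the-envelope estimate of what a fully evaluated [Double] holding every value of the file might cost. It is only a sketch under assumptions that are not stated in the post: a 64-bit GHC with 8-byte words, the usual heap layout of a three-word cons cell plus a two-word boxed Double, and 1 MB taken as 10^6 bytes. All names in it are made up for illustration.

```haskell
-- Rough heap estimate for holding a 200 MB file of doubles in a [Double].
-- ASSUMPTIONS (not from the post): 64-bit GHC, 8-byte words, every
-- element evaluated and boxed, 1 MB = 10^6 bytes.

fileBytes :: Integer
fileBytes = 200 * 1000 * 1000

wordBytes :: Integer
wordBytes = 8

-- Number of 8-byte doubles stored in the file.
elementCount :: Integer
elementCount = fileBytes `div` 8

-- A cons cell is a header plus two pointers (3 words); a boxed
-- Double is a header plus its 8-byte payload (2 words).
bytesPerElement :: Integer
bytesPerElement = (3 + 2) * wordBytes

-- Live heap needed by the list itself.
listBytes :: Integer
listBytes = elementCount * bytesPerElement

main :: IO ()
main = do
  print elementCount     -- 25000000
  print bytesPerElement  -- 40
  print listBytes        -- 1000000000
```

Under those assumptions the list alone would need about 1 GB, five times the size of the file, and a copying garbage collector may need room for a second copy of the live data while it runs, so a peak around 2 GB is at least plausible. On a 32-bit build the per-element cost would be smaller.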