[Haskell-beginners] Processing a list of files the Haskell way
Michael Schober
Micha-Schober at web.de
Sat Mar 10 12:55:06 CET 2012
Hi everyone,
I'm currently trying to solve a problem in which I have to process a
long list of files. More specifically, I want to compute MD5 checksums
for all files.
I have code which lists all the files and holds them in the following
data structure:
    data DirTree = FileNode FilePath | DirNode FilePath [DirTree]
I tried the following:
    -- calculates MD5 sums for all files in a dirtree
    addChecksums :: DirTree -> IO [(DirTree,MD5Digest)]
    addChecksums dir = addChecksums' [dir]
      where
        addChecksums' :: [DirTree] -> IO [(DirTree,MD5Digest)]
        addChecksums' [] = return []
        addChecksums' (f@(FileNode fp):re) = do
          bytes <- BL.readFile fp
          rest <- addChecksums' re
          return ((f,md5 bytes):rest)
        addChecksums' ((DirNode fp filelist):re) = do
          efiles <- addChecksums' filelist
          rest <- addChecksums' re
          return $ efiles ++ rest
This works fine, but only for a small number of files. If I try it on a
big directory tree, memory fills up and the program aborts with an
error message telling me that there are too many open files.
So I guess I have to sequentialize the code a little bit more. But at
the same time, I want to keep it as functional as possible, and I don't
want to write C-like code.
What would be the Haskell way to do something like this?
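One possible direction, offered as a hedged sketch rather than a
definitive answer: BL.readFile reads lazily, so each file is likely to
stay open until its contents are actually consumed, and in the code
above nothing forces `md5 bytes` before the next file is opened. The
sketch below finishes each file before moving on to the next one. The
names `filesOf` and `checksumsWith` are hypothetical, and the digest
function is passed in as a parameter (for the code above that would be
`md5`) so that the sketch depends only on base and bytestring. It
assumes that evaluating the digest to weak head normal form consumes
the whole file, which holds for a strict digest type but is not
guaranteed for an arbitrary result type.

```haskell
import Control.Exception (evaluate)
import qualified Data.ByteString.Lazy as BL
import System.IO (IOMode (ReadMode), withBinaryFile)

data DirTree = FileNode FilePath | DirNode FilePath [DirTree]

-- Flatten the tree into the list of file paths it contains
-- (hypothetical helper, not part of the original code).
filesOf :: DirTree -> [FilePath]
filesOf (FileNode fp)        = [fp]
filesOf (DirNode _ children) = concatMap filesOf children

-- Apply a digest function to every file, one file at a time.
-- withBinaryFile closes the handle when the inner action returns,
-- and evaluate forces the digest (to weak head normal form) while
-- the handle is still open. ASSUMPTION: forcing the digest that far
-- reads the entire file.
checksumsWith :: (BL.ByteString -> d) -> DirTree -> IO [(FilePath, d)]
checksumsWith digest tree = mapM hashOne (filesOf tree)
  where
    hashOne fp = withBinaryFile fp ReadMode $ \h -> do
      bytes <- BL.hGetContents h      -- still a lazy, chunked read
      d <- evaluate (digest bytes)    -- consume the file now
      return (fp, d)
```

With this shape at most one file handle should be open at any moment,
and the traversal itself stays a pure function; only the per-file
hashing lives in IO. A call such as `checksumsWith md5 tree` would
return file paths paired with digests instead of the
(DirTree, MD5Digest) pairs used above.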
Thanks for all the input,
Michael