[Haskell] Re: Probably a trivial thing for people knowing Haskell
apfelmus
apfelmus at quantentunnel.de
Tue Oct 21 04:52:03 EDT 2008
Friedrich wrote:
> Paul Johnson writes:
>>
>> -- Concatenate all the files into one big string. File reading is
>> lazy, so this won't take all the memory.
>> getAllFiles :: [String] -> IO String
>> getAllFiles paths = do
>> contents <- mapM getFile paths
>> return $ concat contents
>>
>> Then use "lines" to split the result into individual lines and process
>> them using "filter", "map" and "foldr". Because file reading is lazy,
>> each line is only read when it is to be processed, and then gets
>> reaped by the garbage collector. So it all runs in constant memory.
>
> Would you mind to elaborate a bit about it. What's so terrible to open
> one file after the other, reading it line by line and close the file
> thereafter.
It's not beautiful.
Here's a more idiomatic version
{-# LANGUAGE BangPatterns #-}
module Main where
import Control.Monad
import System.Directory
import Text.Regex
import Data.List
import Data.Maybe
main = do
files <- filter_reg "[0-9].*" `liftM` getDirectoryContents "."
(sum,count) <- sumcount `liftM` mapM run_file files
let dd = fromIntegral sum / fromIntegral count
putStrLn $ "Download = " ++ show sum
++ " in " ++ show count
++ " days are " ++ show dd ++ " downloads/day"
sumcount :: [(Integer,Int)] -> (Integer,Int)
sumcount = foldl' (\(!s,!c) (ds,dc) -> (s+ds,c+dc)) (0,0)
run_file name =
(sumcount . map check_line . lines) `liftM` readFile' name
readFile' name = unsafeInterleaveIO $
openFile name ReadMode >>= hGetContents
regexp = mkRegex "([0-9]+) Windows ex"
check_line line = case matchRegex regexp line of
Just (s:_) -> (read s,1)
Nothing -> (0,0)
filter_reg pat = let reg = mkRegex pat in
filter $ isJust . matchRegex reg
It's much shorter and should run in constant memory as well.
Regards,
apfelmus
More information about the Haskell
mailing list