[Haskell-beginners] Re: Lazy file IO & Space leaks/waste

Heinrich Apfelmus apfelmus at quantentunnel.de
Fri Nov 6 06:06:57 EST 2009


Aleksandar Dimitrov wrote:
> The important bits are as follows:
> 
>> mf :: [C.ByteString] -> StdWord
>> mf [] = Word [] C.empty
>> mf s = Word (tail s) (head s)
>>
>> f' = mf . reverse . C.words
> 
>> main :: IO ()
>> main = do
>>     corpus_name <- liftM head getArgs
>>     corpus <- liftM (Corpus . (map f') . C.lines) $ C.readFile corpus_name
>>     print $ length (content corpus)
>>     let interesting = filterForInterestingTags interestingTags corpus
>>     print $ show (freqMap interesting)
> 
>  [...]
>
> Ideally, only a very small part of the file should ever be in memory, with
> processing happening incrementally!

The

    print $ length (content corpus)

statement seems to contradict your goal. Computing the length forces the
entire list, and because  corpus  is used again afterwards (by
filterForInterestingTags), the whole file has to stay in memory.
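
One way around this is to compute the line count and the frequency map in
the same strict left fold, so that each line can be garbage collected as
soon as it has been consumed. The following is only a minimal sketch under
some assumptions: that  C  is Data.ByteString.Lazy.Char8, and that the
"tag" of a line is its last word (which is what  mf . reverse . C.words
puts into the second field of Word). The names Stats, step and summarize
are hypothetical; they stand in for your Corpus, filterForInterestingTags
and freqMap, which I have not seen.

```haskell
import qualified Data.ByteString.Lazy.Char8 as C
import qualified Data.Map.Strict as M
import Data.List (foldl')

-- Hypothetical running summary: lines seen so far and a frequency map of
-- tags. The strict fields keep thunks from piling up inside the fold.
data Stats = Stats
  { lineCount :: !Int
  , tagFreqs  :: !(M.Map C.ByteString Int)
  }

-- Consume one line: always bump the counter, and count the last word of
-- the line (assumed to be the tag) if the line has any words at all.
step :: Stats -> C.ByteString -> Stats
step (Stats n m) line =
  case reverse (C.words line) of
    []        -> Stats (n + 1) m
    (tag : _) ->
      -- C.copy gives the key its own storage, so the map does not keep
      -- the chunk of the input file it was sliced from alive.
      Stats (n + 1) (M.insertWith (+) (C.copy tag) 1 m)

-- A single pass over the input; nothing holds on to the list of lines.
summarize :: C.ByteString -> Stats
summarize = foldl' step (Stats 0 M.empty) . C.lines
```

In  main  this would be used roughly as

    stats <- liftM summarize (C.readFile corpus_name)

followed by printing  lineCount stats  and  tagFreqs stats . Since both
results come out of one traversal, the list of lines is never needed a
second time and no longer has to be retained.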


Regards,
apfelmus

--
http://apfelmus.nfshost.com


