[Haskell-cafe] Processing of large files

Henning Thielemann iakd0 at clusterf.urz.uni-halle.de
Mon Nov 1 13:06:10 EST 2004


On Mon, 1 Nov 2004, Alexander N. Kogan wrote:

> import System.Environment
> 
> merge [] x = [(x,1)]
> merge (e@(a,b):xs) x | x == a  = (a,b+1):xs
>                      | otherwise  = e : merge xs x
> 
> procFile =
>     putStrLn       .
>     show       .
>     foldl merge []      .
>     words
> 
> main = do
>     args <- getArgs
>     readFile (head args) >>= procFile


> How should I modify it to make it useful on large file?
> It eats too much memory...

Hm, if speed were your problem, I'd suggest FiniteMap or some other
dictionary data type. :-) But there is probably no way to reduce the
memory requirements significantly if the number of distinct words is
large. Maybe you can reduce them by a constant factor by using a
different String representation (PackedString?).
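
To make the dictionary suggestion concrete, here is a minimal sketch. It
assumes Data.Map.Strict from the containers package as the dictionary type
in place of FiniteMap, and the name countWords is made up for illustration;
neither comes from the original program. The strict left fold (foldl') is
a further assumption on my part: it forces the map after every word rather
than building up a chain of pending merges. Memory still grows with the
number of distinct words, as said above.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (foldl')
import System.Environment (getArgs)

-- Count how often each word occurs.
-- countWords is a hypothetical name, not part of the original program.
-- Map.insertWith adds 1 to an existing count, or inserts 1 for a new word;
-- the strict map keeps the counts evaluated instead of storing thunks.
countWords :: String -> Map.Map String Int
countWords = foldl' (\m w -> Map.insertWith (+) w 1 m) Map.empty . words

main :: IO ()
main = do
    args <- getArgs
    case args of
        [path] -> readFile path >>= print . Map.toList . countWords
        _      -> putStrLn "usage: wordcount FILE"
```

For example, Map.toList (countWords "a b a") gives [("a",2),("b",1)]. Each
lookup and update takes time logarithmic in the number of distinct words,
compared with the linear list scan done by merge.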


