[Haskell-beginners] How to improve laziness of a foldl (and memory footprint)

Giacomo Tesio giacomo at tesio.it
Tue May 14 11:22:27 CEST 2013


Hi, I'm trying to improve a small Haskell program of mine.
A more extended description with full source code is here:
http://codereview.stackexchange.com/questions/26107/how-to-improve-readability-and-memory-footprint-of-this-haskell-script

The script transforms CSV files into other CSV files, but it looks like it
reads the whole input before writing any output file.

I guess the script can be improved in many ways, both in readability and in
efficiency, so any suggestion is welcome as an occasion to learn.

But what I can't understand is why this design doesn't work:

transformFile :: FilePath -> ([String] -> a) -> (a -> IO r) -> IO r
transformFile file operation continuation =
    withFile file ReadMode (\h -> hGetContents h >>= (continuation.operation.lines))

This function receives a path, a function that left-folds the lines into a
new list of objects, and a function that persists the fold output to files.
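
To show how the three arguments fit together, here is a minimal sketch with a
much simpler operation than the real one. The name countNonEmptyLines is
hypothetical and only for illustration; transformFile is the definition above.
The continuation is evaluate rather than return, on the assumption that the
result has to be forced while the handle is still open, since hGetContents
reads lazily.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Same definition as in the post, repeated so the sketch is self-contained.
transformFile :: FilePath -> ([String] -> a) -> (a -> IO r) -> IO r
transformFile file operation continuation =
  withFile file ReadMode (\h -> hGetContents h >>= (continuation . operation . lines))

-- Hypothetical example: count the non-empty lines of a file.
-- 'evaluate' forces the Int inside withFile, before the handle is closed;
-- with plain 'return' the count would be computed after the file is closed.
countNonEmptyLines :: FilePath -> IO Int
countNonEmptyLines path =
  transformFile path (length . filter (not . null)) evaluate
```

Here the operation is pure and the continuation decides how much of its
result gets demanded while the file is open.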

Here are the relevant parts:

importTrades :: FilePath -> FilePath -> IO ()
importTrades outDir csvFile =
    transformFile csvFile (foldTradingSample.getTickWriteTrades) (saveTradingSamples outDir)
    where getTickWriteTrades = filter (isBetween (9, 0) (18, 0)).(catMaybes.(map fromCSVLine))
          foldTradingSample = foldl toTradingSample []

This is the folding function:

toTradingSample :: [TradingSample] -> Tw.Trade -> [TradingSample]
toTradingSample (current:others) twTrade
    | newEqt == equity current && newDay == day current = (current { trades = newTrades }):others
    | otherwise = current : toTradingSample others twTrade
    where newEqt = Tw.tSimbol twTrade
          newDay = Tw.tDate twTrade
          newTrade = fromTickWrite twTrade
          newTrades = trades current ++ [newTrade]
toTradingSample [] twTrade = [TradingSample { equity = Tw.tSimbol twTrade
                                            , day = Tw.tDate twTrade
                                            , trades = [fromTickWrite twTrade]
                                            }]

And these are the functions that save the fold results to files:

saveTradingSamples :: String -> [TradingSample] -> IO ()
saveTradingSamples folder samples = mapM_ (saveTradingSample folder) samples

saveTradingSample :: String -> TradingSample -> IO ()
saveTradingSample folder sample = writeFile fileName contents
    where fileName = folder ++ "\\" ++ (equity sample) ++ "_" ++
                     (formatTime defaultTimeLocale "%F" $ day sample) ++ ".CSV"
          contents = tradingSampleToCSV sample


What's wrong here?

My guess is that the problem is in the signature of transformFile, which
requires the list of TradingSample to be completely computed before
saveTradingSamples is called.
Is this the problem? How can I fix this?
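
To make the question concrete, here is a small sketch that isolates the fold
from the file handling. Everything in it is hypothetical: Trade is a
simplified stand-in for Tw.Trade, and groupWithFoldl only mimics the shape of
toTradingSample. It contrasts a left fold, which cannot hand over any group
before the input ends, with a groupBy-based version that can, under the
assumption (not true of the original data as far as stated) that trades with
the same key are adjacent in the input.

```haskell
import Data.Function (on)
import Data.List (groupBy)

-- Hypothetical stand-in for a parsed trade: (equity symbol, price).
type Trade = (String, Double)

-- Left fold shaped like toTradingSample: every new trade may still modify
-- any group accumulated so far, so no group is final until the input ends.
groupWithFoldl :: [Trade] -> [(String, [Double])]
groupWithFoldl = foldl step []
  where
    step [] (k, p) = [(k, [p])]
    step ((k', ps) : rest) (k, p)
      | k == k'   = (k', ps ++ [p]) : rest
      | otherwise = (k', ps) : step rest (k, p)

-- Lazy alternative, ASSUMING trades with the same key are adjacent.
-- A group is complete as soon as the key changes, so a consumer can
-- process it before the rest of the input is read.
groupAdjacent :: [Trade] -> [(String, [Double])]
groupAdjacent trades =
  [ (k, map snd g) | g@((k, _) : _) <- groupBy ((==) `on` fst) trades ]
```

On a finite input where equal keys are adjacent, both functions give the same
groups. On an unbounded input, take 2 (groupAdjacent ts) can still return,
while any use of groupWithFoldl ts has to traverse the whole list first,
whatever the type of the function that consumes its result.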


Giacomo