[Haskell-cafe] operating on a hundred files at once

Chad Scherrer chad.scherrer at gmail.com
Tue Apr 10 02:03:34 EDT 2007


Hi Jeff,

> I have a series of NxM numeric tables I'm doing a quick
> mean/variance/t-test etcetera on.  The cell t1 [i,j] corresponds exactly
> to the cells t2..N [i,j], and so it's perfectly possible to read one
> item at a time from each of the 100 files and compute the mean/variance
> etcetera on all cells that way.

So after mapping openAndProcess, you have a 100xNxM "array" (really a
list-of-lists-of-lists), right? And then when you take means and
variances, which index are you doing this with respect to? As I read
it, you seem to be trying to eliminate the first axis, and end up with
an NxM "array".

If this is the case, let's say we have
mean, variance :: [Double] -> Double
openAndProcess :: String -> IO (Matrix String)

Here, defining
type Matrix a = [[a]]
makes it easier to keep the types straight.
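
For concreteness, here is a minimal sketch of what mean and variance could look like. The thread does not say which variance is wanted, so the sample variance (dividing by n - 1) is an assumption, as is the requirement that the input lists are non-empty.

```haskell
-- Arithmetic mean. Assumes a non-empty list; an empty list yields NaN.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Sample variance (divides by n - 1). This choice is an assumption;
-- it needs at least two values to be meaningful.
variance :: [Double] -> Double
variance xs = sum [(x - m) ^ 2 | x <- xs] / fromIntegral (length xs - 1)
  where
    m = mean xs
```

Both traverse the list more than once, which is fine for a hundred values per cell.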

Then you have these building blocks:

(map . map . map) read :: [Matrix String] -> [Matrix Double]

transpose2 :: [Matrix Double] -> Matrix [Double]
(a couple of lines, maybe even a one-liner, if you use the fact that [a] is a monad)

(map . map) mean :: Matrix [Double] -> Matrix Double

Composing these gives a function [Matrix String] -> Matrix Double, so
once we get to [Matrix String], we're effectively done.
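
The composition described above can be sketched as follows. Note that this transpose2 is built from Data.List.transpose rather than the list monad, so it is just one possible definition, and the name cellMeans is made up for illustration. It assumes every matrix has the same shape and that every cell parses as a Double.

```haskell
import Data.List (transpose)

type Matrix a = [[a]]

mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Turn a list of k matrices (each NxM) into one NxM matrix whose
-- cells are lists of k values. Indexing goes from [k][i][j] to [i][j][k].
transpose2 :: [Matrix a] -> Matrix [a]
transpose2 = map transpose . transpose

-- Parse every cell, gather corresponding cells, then average each group.
-- cellMeans is a hypothetical name, not something from the thread.
cellMeans :: [Matrix String] -> Matrix Double
cellMeans = (map . map) mean . transpose2 . (map . map . map) read
```

Swapping mean for variance (or any other [Double] -> Double) gives the other statistics with no further changes.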

You also use
map openAndProcess :: [String] -> [IO (Matrix String)]

You can use sequence to get the IO outside the list, so now you have
IO [Matrix String]. All you have to do now is use liftM on your
function [Matrix String] -> Matrix Double, which turns it into a
function IO [Matrix String] -> IO (Matrix Double).
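
Putting the IO part together might look like this. The body of openAndProcess here is only a stand-in (one row per line, cells separated by whitespace), since the real file format is not given in the thread, and meansOfFiles and cellMeans are hypothetical names.

```haskell
import Control.Monad (liftM)
import Data.List (transpose)

type Matrix a = [[a]]

-- Stand-in reader: assumes one row per line, whitespace-separated cells.
openAndProcess :: String -> IO (Matrix String)
openAndProcess path = liftM (map words . lines) (readFile path)

mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- The pure part: [Matrix String] -> Matrix Double.
cellMeans :: [Matrix String] -> Matrix Double
cellMeans = (map . map) mean . map transpose . transpose . (map . map . map) read

-- sequence pulls the IO outside the list; liftM lifts the pure function.
meansOfFiles :: [String] -> IO (Matrix Double)
meansOfFiles = liftM cellMeans . sequence . map openAndProcess
```

Since sequence . map f is the same as mapM f, the last line could also be written as liftM cellMeans . mapM openAndProcess.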

-- 

Chad Scherrer

"Time flies like an arrow; fruit flies like a banana" -- Groucho Marx

