[Haskell-cafe] monadic MapReduce
manlio_perillo at libero.it
Mon Mar 2 10:10:41 EST 2009
Anish Muttreja wrote:
> On Sun, Mar 01, 2009 at 07:25:56PM +0100, Manlio Perillo wrote:
>> I have a function that does some IO (takes a file path, reads the file,
>> parses it, and returns some data), and I would like to parallelize it, so
>> that multiple files can be parsed in parallel.
>> I would like to use the simple mapReduce function,
>> from Real World Haskell:
>> mapReduce :: Strategy b   -- evaluation strategy for mapping
>>           -> (a -> b)     -- map function
>>           -> Strategy c   -- evaluation strategy for reduction
>>           -> ([b] -> c)   -- reduce function
>>           -> [a]          -- list to map over
>>           -> c
>> mapReduce mapStrat mapFunc reduceStrat reduceFunc input =
>>     mapResult `pseq` reduceResult
>>   where mapResult    = parMap mapStrat mapFunc input
>>         reduceResult = reduceFunc mapResult `using` reduceStrat
>> Is this possible?
>> Thanks, Manlio Perillo
> Would this work?
I suspect that it will not work.
> Read in each file into a string (or byteString) using a lazy function
> and then call mapReduce with the strings instead of file paths.
> import qualified Data.ByteString.Lazy.Char8 as L
> import System.IO
> handles <- mapM (\f -> openFile f ReadMode) files
> strings <- mapM L.hGetContents handles
> let result = mapReduce ...
> The actual work of reading in the file should happen on-demand inside the
> parsing function called by mapReduce.
By doing this I will probably lose any control over file resource usage: with lazy hGetContents, every handle stays open (semi-closed) until that file's contents have been fully consumed, so all the files may be open at once.
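What I had in mind is closer to a monadic map: run the IO parsing actions themselves in parallel, with each worker opening, strictly reading, and closing its own file. A rough sketch using only forkIO and MVars from base (mapReduceIO and its argument names are mine, just for illustration, not from Real World Haskell):

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Monadic mapReduce sketch: run one IO action per input in its own
-- thread, collect the results, then reduce them purely.
mapReduceIO :: (a -> IO b) -> ([b] -> c) -> [a] -> IO c
mapReduceIO mapFunc reduceFunc inputs = do
  vars <- mapM (\x -> do
            v <- newEmptyMVar
            _ <- forkIO (mapFunc x >>= putMVar v)  -- worker runs the IO action
            return v) inputs
  results <- mapM takeMVar vars  -- wait for every worker to finish
  return (reduceFunc results)
```

Each worker thread performs its own open/read/close cycle, so the number of open handles is bounded by the number of live threads rather than by the number of input files.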