[Haskell-cafe] monadic MapReduce
Anish Muttreja
anishmuttreja at gmail.com
Mon Mar 2 18:57:44 EST 2009
On Mon, Mar 02, 2009 at 04:10:41PM +0100, Manlio Perillo wrote:
> Anish Muttreja wrote:
>> On Sun, Mar 01, 2009 at 07:25:56PM +0100, Manlio Perillo wrote:
>>> Hi.
>>>
>>> I have a function that does some IO (it takes a file path, reads the file,
>>> parses it, and returns some data), and I would like to parallelize it so
>>> that multiple files can be parsed in parallel.
>>>
>>> I would like to use the simple mapReduce function
>>> from Real World Haskell:
>>>
>>> mapReduce :: Strategy b    -- evaluation strategy for mapping
>>>           -> (a -> b)      -- map function
>>>           -> Strategy c    -- evaluation strategy for reduction
>>>           -> ([b] -> c)    -- reduce function
>>>           -> [a]           -- list to map over
>>>           -> c
>>>
>>> mapReduce mapStrat mapFunc reduceStrat reduceFunc input =
>>>     mapResult `pseq` reduceResult
>>>   where mapResult    = parMap mapStrat mapFunc input
>>>         reduceResult = reduceFunc mapResult `using` reduceStrat
>>>
>>> Is this possible?
>>>
>>>
>>> Thanks Manlio Perillo
>>
>> Would this work?
>
> I suspect that it will not work.
>
>> Read each file into a String (or ByteString) using a lazy function
>> and then call mapReduce with the strings instead of the file paths.
>>
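>> import qualified Data.ByteString.Lazy.Char8 as L
>> import System.IO (IOMode (ReadMode), openFile)
>>
>> do
>>   -- openFile is an IO action and needs a mode, so use mapM rather than map
>>   handles <- mapM (\f -> openFile f ReadMode) files
>>   strings <- mapM L.hGetContents handles
>>   let result = mapReduce ...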
>>
>> The actual work of reading in the file should happen on-demand inside
>> the parsing function called by mapReduce.
>>
>
> By doing this I will probably lose any control over file resource usage.
OK.
How about this: is there a reason why I can't replace the type variables
b and c in the type signature of mapReduce with (IO b') and (IO c')?
b and c can be any types.
mapReduce :: Strategy (IO b')        -- evaluation strategy for mapping
          -> (a -> IO b')            -- map function
          -> Strategy (IO c')        -- evaluation strategy for reduction
          -> ([IO b'] -> (IO c'))    -- reduce function
          -> [a]                     -- list to map over
          -> (IO c')
Just remember to wrap all values back in the IO monad.
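
One caveat: evaluating a value of type IO b' (for example with a
strategy) does not run the action, so the signature above typechecks but
would not by itself perform the file reads in parallel. A minimal sketch
of a different route, not the Real World Haskell mapReduce: run each map
action in its own thread with forkIO and collect the results through
MVars. The names mapReduceIO, spawn and collect are made up for
illustration, and only the base package is assumed.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, evaluate, throwIO, try)

-- Hypothetical helper: run an action in a new thread and return an MVar
-- that will eventually hold its result or the exception it raised.
spawn :: IO b -> IO (MVar (Either SomeException b))
spawn action = do
  var <- newEmptyMVar
  _ <- forkIO (try action >>= putMVar var)
  return var

-- Hypothetical helper: wait for a worker and rethrow its exception, if any.
collect :: MVar (Either SomeException b) -> IO b
collect var = takeMVar var >>= either throwIO return

-- Hypothetical monadic variant of mapReduce (an assumption, not the book's code).
mapReduceIO :: (a -> IO b)    -- map function, performing IO
            -> ([b] -> IO c)  -- reduce function
            -> [a]            -- list to map over
            -> IO c
mapReduceIO mapFunc reduceFunc input = do
  -- One thread per input; evaluate forces each result to weak head normal
  -- form inside the worker thread (deeper structure may still be lazy).
  vars <- mapM (\x -> spawn (mapFunc x >>= evaluate)) input
  -- Results are collected in the same order as the input list.
  results <- mapM collect vars
  reduceFunc results
```

For the threads to run on several cores the program has to be built with
-threaded and run with +RTS -N. This sketch starts one thread per input,
so it does nothing to limit how many files are open at once.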
Anish
>
>
> Thanks Manlio