[Haskell-cafe] monadic MapReduce

Tue Mar 3 12:27:35 EST 2009

> How about this. Is there a reason why I can't
> replace the variables b and c in the type signature of mapReduce with (IO b')
> and (IO c')? b and c can be any types.
>
> mapReduce :: Strategy (IO b')      -- evaluation strategy for mapping
>           -> (a -> IO b')          -- map function
>           -> Strategy (IO c')      -- evaluation strategy for reduction
>           -> ([IO b'] -> IO c')    -- reduce function
>           -> [a]                   -- list to map over
>           -> IO c'
>
> Just remember to wrap all values back in the IO monad.

Remember, the idea of map-reduce is to provide a very restrictive
programming interface so that you have a lot of flexibility in
your execution strategy.  If you start loosening the interface,
you will still be able to execute the program, but you may
lose the freedom to perform all the optimizations you want.
For example, if you are using IO actions that are
stateful, what are the semantics?  Can one map action affect
other map actions?  Does this imply an ordering of the map functions?
Does this imply they all run on the same machine or at least have
state communicated between the machines on which they run?

The austere interface precludes any of these issues, and therein
lies the beauty.
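
To make the ordering question concrete, here is a rough sketch. The
names tagWithCounter and tagAll are made up for illustration and are
not part of any mapReduce library. It assumes the simplest kind of
state, a shared counter in an IORef, and shows that the result of a
stateful map depends on the order in which the actions run, while a
pure map function does not care.

```haskell
import Data.IORef (IORef, newIORef, readIORef, modifyIORef')

-- Hypothetical stateful map action: pairs each input with the current
-- value of a shared counter, then bumps the counter.
tagWithCounter :: IORef Int -> Int -> IO (Int, Int)
tagWithCounter ref x = do
  n <- readIORef ref
  modifyIORef' ref (+ 1)
  return (n, x)

-- Run the stateful map over a list, left to right, with a fresh counter.
tagAll :: [Int] -> IO [(Int, Int)]
tagAll xs = do
  ref <- newIORef 0
  mapM (tagWithCounter ref) xs

main :: IO ()
main = do
  let input = [10, 20, 30]
  forward  <- tagAll input
  backward <- fmap reverse (tagAll (reverse input))
  -- Same inputs, different schedule, different answers.
  print forward    -- [(0,10),(1,20),(2,30)]
  print backward   -- [(2,10),(1,20),(0,30)]
  -- A pure map function gives the same answer under either schedule.
  print (map (* 2) input == reverse (map (* 2) (reverse input)))  -- True
```

With the pure interface the system may run the map calls in any order
or on any machine.  With the counter it would have to pick one order
and stick to it, or ship the counter around.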

> Anish

Btw, I prefer the Sawzall formulation over the map-reduce formulation.
A Sawzall program just specifies how to map some data to a monoid
and the system is free to group the mappend calls however
it wants (by applying associativity).
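
A minimal sketch of that idea in Haskell, assuming nothing beyond the
standard Monoid class.  The names mapToMonoid, mapToMonoidChunked,
chunksOf and stats are hypothetical, chosen only for this example,
and this is not how Sawzall itself is implemented.  The user supplies
only the function into the monoid; the two drivers combine the
results with different groupings and must agree because mappend is
associative.

```haskell
import Data.Monoid (Sum(..))

-- Map every record to a monoid value and combine them right to left.
mapToMonoid :: Monoid m => (a -> m) -> [a] -> m
mapToMonoid f = foldr (\x acc -> f x `mappend` acc) mempty

-- Same computation with a different grouping: reduce each chunk on its
-- own (as separate workers might), then combine the partial results.
mapToMonoidChunked :: Monoid m => Int -> (a -> m) -> [a] -> m
mapToMonoidChunked n f = mconcat . map (mapToMonoid f) . chunksOf n

-- Split a list into pieces of at most n elements.
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | n <= 0    = [xs]
  | null xs   = []
  | otherwise = let (h, t) = splitAt n xs in h : chunksOf n t

-- Example user function: word count and total length, as a pair of sums.
stats :: String -> (Sum Int, Sum Int)
stats w = (Sum 1, Sum (length w))

main :: IO ()
main = do
  let ws = words "the quick brown fox jumps over the lazy dog"
  print (mapToMonoid stats ws)
  -- The grouping does not change the result.
  print (mapToMonoid stats ws == mapToMonoidChunked 2 stats ws)
```

Note that associativity only licenses regrouping.  If the system also
wants to reorder the values, the monoid has to be commutative as well,
which the sums used here happen to be.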

Tim Newsham
http://www.thenewsh.com/~newsham/