[Haskell-cafe] performance of map reduce
Manlio Perillo
manlio_perillo at libero.it
Fri Sep 19 12:41:52 EDT 2008
Hi again.
In
http://book.realworldhaskell.org/read/concurrent-and-multicore-programming.html#id676390
there is a map reduce based log parser.
I have written an alternative version:
http://paste.pocoo.org/show/85699/
but, with a 315 MB file, I get the following timings [1]:
1) map reduce implementation, non parallel
real 0m6.643s
user 0m6.252s
sys 0m0.212s
2) map reduce implementation, parallel with 2 cores
real 0m3.840s
user 0m6.384s
sys 0m0.652s
3) my implementation
real 0m8.121s
user 0m7.804s
sys 0m0.216s
Why is the map reduce implementation faster, even
when it is not parallelized?
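
For reference, the general shape of a map reduce line counter can be sketched as below. This is only a minimal sketch, not the book's code and not the pasted alternative: the names chunksOnLines, parMapList, mapReduce and countLines are hypothetical, and it uses par and pseq from GHC.Conc instead of the Control.Parallel.Strategies module. The input is cut into line-aligned chunks, each chunk is mapped to a partial count, and the partial counts are reduced with sum.

```haskell
import qualified Data.ByteString.Lazy.Char8 as L
import Data.Int (Int64)
import GHC.Conc (par, pseq)

-- Split the input into chunks of at least n bytes, extending each chunk
-- to the next newline so that no line is cut in half. The last chunk
-- ends at the end of the input.
chunksOnLines :: Int64 -> L.ByteString -> [L.ByteString]
chunksOnLines n s
  | L.null s  = []
  | otherwise =
      let (prefix, rest)    = L.splitAt n s
          (lineTail, rest') = L.break (== '\n') rest
          (nl, rest'')      = L.splitAt 1 rest'   -- keep the newline with the chunk
      in L.concat [prefix, lineTail, nl] : chunksOnLines n rest''

-- Apply f to every element, sparking each application so that it may be
-- evaluated (to weak head normal form) on another core.
parMapList :: (a -> b) -> [a] -> [b]
parMapList _ []     = []
parMapList f (x:xs) = y `par` (ys `pseq` (y : ys))
  where
    y  = f x
    ys = parMapList f xs

-- Map step over the chunks, followed by a reduce step over the results.
mapReduce :: (a -> b) -> ([b] -> c) -> [a] -> c
mapReduce mapFunc reduceFunc = reduceFunc . parMapList mapFunc

-- Count the lines of the input, using chunks of roughly chunkSize bytes.
countLines :: Int64 -> L.ByteString -> Int
countLines chunkSize =
  mapReduce (fromIntegral . L.count '\n') sum . chunksOnLines chunkSize
```

The definitions can be loaded in GHCi and tried with, for example, countLines 65536 applied to the result of L.readFile. The sparks only run on several cores when the program is built with -threaded and started with +RTS -N2; otherwise the same code runs sequentially.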
Is it possible to implement a map reduce version that can handle gzipped
log files?
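
One possible direction, as a sketch only: a gzip stream has to be decoded sequentially, so the compressed file cannot be cut at arbitrary byte offsets the way a plain text file can. The decoded stream, however, can still be fed lazily to a map step and a reduce step. The code below takes the decoder as a parameter so that it depends only on the bytestring package; the assumption is that a lazy decoder such as decompress from Codec.Compression.GZip (zlib package) would be passed in. The names Decoder, countLinesDecoded and countLinesInFile are hypothetical.

```haskell
import qualified Data.ByteString.Char8 as S
import qualified Data.ByteString.Lazy.Char8 as L
import Data.List (foldl')

-- A lazy decoder from compressed bytes to plain bytes.
-- Assumption: for gzipped logs this would be Codec.Compression.GZip.decompress.
type Decoder = L.ByteString -> L.ByteString

-- Decode the input, count newlines in each strict chunk of the decoded
-- stream (map step), then add up the partial counts (reduce step).
countLinesDecoded :: Decoder -> L.ByteString -> Int
countLinesDecoded decode =
  foldl' (+) 0 . map (S.count '\n') . L.toChunks . decode

-- Read a file lazily and count its lines after decoding.
-- The count is forced before returning, so the whole file is consumed.
countLinesInFile :: Decoder -> FilePath -> IO Int
countLinesInFile decode path = do
  bytes <- L.readFile path
  return $! countLinesDecoded decode bytes
```

Counting newlines does not care where the chunk boundaries fall. A real log parser would first need to regroup the decoded chunks on line boundaries, and the decoding itself would remain a sequential step in front of any parallel map.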
[1] These tests do not include the first run.
For the first run (no data in the OS cache), I get (not verified):
1) map reduce implementation, parallel with 2 cores
real 0m3.735s
user 0m6.328s
sys 0m0.604s
2) my implementation
real 0m13.659s
user 0m7.712s
sys 0m0.360s
Thanks
Manlio Perillo