[Haskell-cafe] performance of map reduce
Don Stewart
dons at galois.com
Fri Sep 19 13:12:43 EDT 2008
manlio_perillo:
> Hi again.
>
> In
> http://book.realworldhaskell.org/read/concurrent-and-multicore-programming.html#id676390
> there is a map reduce based log parser.
>
> I have written an alternative version:
> http://paste.pocoo.org/show/85699/
>
> but, with a file of 315 MB, I have [1]:
>
> 1) map reduce implementation, non parallel
> real 0m6.643s
> user 0m6.252s
> sys 0m0.212s
>
> 2) map reduce implementation, parallel with 2 cores
> real 0m3.840s
> user 0m6.384s
> sys 0m0.652s
>
> 3) my implementation
> real 0m8.121s
> user 0m7.804s
> sys 0m0.216s
>
>
>
> Why is the map reduce implementation faster, even when it is not
> parallelized?
Possibly differences in how the GC is utilised, or in how optimisation works?
> Is it possible to implement a map reduce version that can handle gzipped
> log files?
Using the zlib binding on hackage.haskell.org, you can stream multiple
zlib decompression threads with lazy bytestrings, and combine the
results.
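
As a rough sketch of that idea (not the book's code, and not a tuned implementation): assuming the logs are split across several gzipped files given on the command line, each file can be decompressed lazily in its own thread and the per-file results summed at the end. The helper name countLines and the choice of line counting as the "map" step are illustrative assumptions; GZip.decompress comes from the zlib package.

```haskell
import qualified Codec.Compression.GZip as GZip
import qualified Data.ByteString.Lazy.Char8 as L
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import System.Environment (getArgs)

-- Hypothetical "map" step: count the newlines in one gzipped file.
-- The file is read and decompressed lazily, so it is streamed in
-- chunks rather than loaded into memory all at once.
countLines :: FilePath -> IO Int
countLines path = do
  compressed <- L.readFile path
  let n = fromIntegral (L.count '\n' (GZip.decompress compressed))
  -- Force the count here so the work happens in the calling thread.
  return $! n

main :: IO ()
main = do
  paths <- getArgs
  -- Start one worker thread per input file; each reports via an MVar.
  vars <- mapM (\p -> do
                  v <- newEmptyMVar
                  _ <- forkIO (countLines p >>= putMVar v)
                  return v)
               paths
  -- "Reduce" step: wait for every worker and combine the results.
  counts <- mapM takeMVar vars
  print (sum counts)
```

To actually use more than one core this would need to be built with -threaded and run with +RTS -N2 (or similar). Note that the sketch parallelises across files; a single gzip stream still has to be decompressed sequentially. There is also no error handling: if a worker throws, its MVar is never filled.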
-- Don