a faster,
accumulating mapM (was Re: [Haskell-cafe] mapM vs mapM_ performance)
Ben
midfield at gmail.com
Thu Apr 24 17:14:17 EDT 2008
Hello Luke and other Haskellers,
Thanks for the reply, but as I noted before, the amount of memory
allocated (and resident) is roughly the same. Anyhow it's definitely
not a GC issue because I wrote an accumulating version of mapM and got
close to mapM_ 's performance.
In the code below, main1 is mapM_, main2 is the current mapM
(basicallly sequence . map), map3 is a hand-coded accumulating
parameter version, mapM2 is the accumulating parameter mapM and main4
uses mapM2. The timings I get are about 15, 175, 20 and 20 seconds
for main1, main2, main3 and main4 respectively. main2 uses about 2%
less memory than main3 or main4 on this particular run, though I don't
know if that is true generally.
Unless someone can see a reason why mapM2 is not as good as mapM, can
I suggest replacing the implementation of mapM by the implementation
of mapM2. A 10x speedup seems to be a bigger deal than GCing 2% more
memory.
best regards, Ben
module Main where
import System.IO (openFile, IOMode(..), hPutStr)
testlst = let ls = [(i, [(j, (fromIntegral j)::Float) | j <-
[1..5]::[Int]]) | i <- [1..500000]::[Int]]
in ls
main = do
h <- openFile "bardump" WriteMode
mapM_ ((hPutStr h) . show) testlst
main2 = do
h <- openFile "bardump2" WriteMode
result <- mapM ((hPutStr h) . show) testlst
print $ length result
main3 = do
h <- openFile "bardump3" WriteMode
result <- dump h testlst []
print $ length result
where dump h (x:xs) accum = do
hPutStr h $ show x
dump h xs $ ():accum
dump _ [] accum = return accum
mapM2 :: Monad m => (a -> m b) -> [a] -> m [b]
{-# INLINE mapM2 #-}
mapM2 fn lst = mapM2accum fn lst []
where mapM2accum _ [] accum = return accum
mapM2accum fn (x:xs) accum = do
r <- fn x
mapM2accum fn xs (r:accum)
main4 = do
h <- openFile "bardump2" WriteMode
result <- mapM2 ((hPutStr h) . show) testlst
print $ length result
On Thu, Apr 24, 2008 at 1:37 AM, Luke Palmer <lrpalmer at gmail.com> wrote:
> On Tue, Apr 22, 2008 at 11:32 AM, Ben <midfield at gmail.com> wrote:
> > Hello Haskellers,
> >
> > I'm running ghc 6.8.2 on vista 64. Consider the following program,
> > which is compiled with -02 -prof -auto-all:
> >
> > module Main where
> >
> > import System.IO (openFile, IOMode(..), hPutStr)
> >
> > testlst = let ls = [(i, [(j, (fromIntegral j)::Float) | j <-
> > [1..5]::[Int]]) | i <- [1..500000]::[Int]]
> > in ls
> >
> > main2 = do
> > h <- openFile "bardump" WriteMode
> > mapM_ ((hPutStr h) . show) testlst
> >
> >
> > main = do
> > h <- openFile "bardump2" WriteMode
> > mapM ((hPutStr h) . show) testlst
> > return ()
> >
> > main and main2 are different in only that mapM_ versus mapM_ are used.
> > But the mapM version runs about 20x slower! I'm running with +RTS -p
> > -hc -RTS and I see that the amount of memory allocated is about the
> > same, and I think the resident memory is about the same too. But the
> > mapM_ version runs in about 8.7 seconds, and the mapM version takes
> > 167 seconds.
>
> My first guess is that the garbage collector is not running at all in
> the mapM_ version, but is working it's ass off in the mapM version
> cleaning up the list that will never be used.
>
>
> > You may ask, why use mapM if you're discarding the values?
> > Unfortunately in my real app I need the values, which are more
> > interesting than IO ().
>
> If you need the values, then you've got to pay that price I suppose.
> If you need the values, I'm going to take a stab that in your real app
> you use a lot of memory because of this (because presumably you're
> keeping the values around), whereas you're just seeing a speed hit on
> this small test program.
>
> Luke
>
More information about the Haskell-Cafe
mailing list