[Haskell-beginners] Re: When, if ever, does Haskell "calculate once"?

Maciej Piechotka uzytkownik2 at gmail.com
Thu May 6 16:49:40 EDT 2010


On Thu, 2010-05-06 at 15:37 -0400, Brent Yorgey wrote:
> 
> On Thu, May 06, 2010 at 11:35:15AM -0700, Travis Erdman wrote:
> > Two questions here, I think they might be related, perhaps even the
> same, but I am not sure, so I will ask both:
> > 
> > Q1:  Re the function f below, I like the first implementation as
> it's "cleaner", but is the second implementation necessary for
> performance purposes?
> > 
> > -- g = some CPU intensive function
> > 
> > -- alternate 1
> > f a b = c + (g b)
> >     where
> >         c = dosomethingelse a (g b)
> >         
> > -- alternate 2        
> > f a b = c + saveit
> >     where
> >         saveit = g b    
> >         c = dosomethingelse a saveit
> 
> You need alternative 2.  In general, GHC (and, I imagine, other
> Haskell compilers) do not do much common subexpression elimination,
> because in some cases it can be a *pessimization*.  The classic
> example is
> 
>   length [1..1000000] + length [1..1000000]
> 
> vs
> 
>   let a = [1..1000000] in length a + length a
> 
> The first will execute in constant space, since each list will be
> lazily generated as needed by the length function and then the garbage
> collector will come along behind length and get rid of the nodes that
> have already been processed.  However, in the second expression, the
> garbage collector cannot get rid of the nodes that are already
> processed by the first call to length, since the second call to length
> still needs the list.  So the entire list [1..1000000] will end up
> being in memory at once. 


Hmm:

> cFoldr :: (a -> b -> b) -> (a -> c -> c) -> (b, c) -> [a] -> (b, c)
> cFoldr f g ~(b, c) []     = (b, c)
> cFoldr f g ~(b, c) (x:xs) = let (b', c') = cFoldr f g (b, c) xs
>                             in (f x b', g x c')
> 
> cFoldl' :: (b -> a -> b) -> (c -> a -> c) -> (b, c) -> [a] -> (b, c)
> cFoldl' f g bc xs0 = lgo bc xs0
>                      where lgo (b, c) [] = (b, c)
>                            lgo (b, c) (x:xs) =
>                                let b' = f b x
>                                    c' = g c x
>                                    bc' = b' `seq` c' `seq` (b', c')
>                                in bc' `seq` lgo bc' xs
> 
> lengthFold :: Int -> a -> Int
> lengthFold !n _ = n + 1
> 
> size = 100000000
>

 
> main = let a = [1..size]
>            l = length a + length a
>       in print $! l

200000000
   8,031,190,480 bytes allocated in the heap
         741,264 bytes copied during GC
           1,920 bytes maximum residency (1 sample(s))
          28,216 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0: 15318 collections,     0 parallel,  0.14s,  0.18s
elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s
elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    7.15s  (  7.26s elapsed)
  GC    time    0.14s  (  0.18s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    7.29s  (  7.44s elapsed)

  %GC time       1.9%  (2.5% elapsed)

  Alloc rate    1,123,257,562 bytes per MUT second

  Productivity  98.1% of total user, 96.1% of total elapsed

./a.out +RTS -s  7.29s user 0.05s system 98% cpu 7.450 total

> main = print $! length [1..size] + length [1..size]

200000000
  16,062,318,576 bytes allocated in the heap
       1,476,904 bytes copied during GC
           2,000 bytes maximum residency (1 sample(s))
          27,992 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0: 30637 collections,     0 parallel,  0.29s,  0.37s
elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s
elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time   15.57s  ( 15.90s elapsed)
  GC    time    0.29s  (  0.37s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time   15.87s  ( 16.27s elapsed)

  %GC time       1.8%  (2.3% elapsed)

  Alloc rate    1,031,313,475 bytes per MUT second

  Productivity  98.2% of total user, 95.7% of total elapsed

./a.out +RTS -s  15.87s user 0.11s system 98% cpu 16.272 total


> main =
>   print $! uncurry (+) (cFoldl' lengthFold lengthFold
>                                 (0, 0) [1..size])

200000000
  13,652,979,960 bytes allocated in the heap
       2,089,600 bytes copied during GC
           2,056 bytes maximum residency (1 sample(s))
          27,984 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0: 26041 collections,     0 parallel,  0.30s,  0.39s
elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s
elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time   21.92s  ( 23.44s elapsed)
  GC    time    0.30s  (  0.39s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time   22.22s  ( 23.84s elapsed)

  %GC time       1.3%  (1.7% elapsed)

  Alloc rate    622,864,614 bytes per MUT second

  Productivity  98.7% of total user, 92.0% of total elapsed

./a.out +RTS -s  22.22s user 0.16s system 93% cpu 23.843 total

All are compiled with optimizations. 

Regards
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: This is a digitally signed message part
Url : http://www.haskell.org/pipermail/beginners/attachments/20100506/99897688/attachment.bin


More information about the Beginners mailing list