[Haskell-beginners] Performance of Idiomatic lazy Haskell

Stephen Tetley stephen.tetley at gmail.com
Sun Jan 31 09:52:41 EST 2010


Hi Daniel

Thanks - the figures are very impressive for the stream fusion
library. I knew the paper, but I hadn't looked at it the
implementation.

Making a stricter unfoldr by using a strictness annotation on the
state and getting rid of the tuple is nowhere near stream fusion-lib,
i.e:

data Maybe2 a st = Nothing2 | Just2 a !st
  deriving (Eq,Show)


Performance wise stream fusion even beats a monoidal unfoldr. A
monoidal unfoldr seems reasonable to me for this problem as there is
no need to generate a list. As a monoidal unfoldr is not in the
'standard' libraries some people might not consider it idiomatic
though.

I put in a fixed value for epsilon in all three version rather than
used echo at the command line and compiled all with ghc --make -O2

--------------

module Main (main) where

import Data.Monoid

data Maybe2 a st = Nothing2 | Just2 a !st
  deriving (Eq,Show)


dummy_eps :: Double
dummy_eps = 0.00000001

main :: IO ()
main = do
    putStrLn "EPS: "
    eps <- return dummy_eps
    let mx = floor (4/eps)
        !k = (mx+1) `quot` 2
    putStrLn $ "PI mit EPS " ++ (show eps) ++ " = " ++ show (leibniz k)

leibniz n = (4 *) $ getSum $ step n

step :: Integer -> Sum Double
step times = unfoldrMon phi (0,True,1) where
   phi (i,_,_)   | i == times  = Nothing2
   phi (i,sig,d) | sig         = Just2 (Sum (1/d)) (i+1,False,d+2)
                 | otherwise   = Just2 (Sum (negate (1/d))) (i+1,True,d+2)


unfoldrMon      :: Monoid a => (b -> Maybe2 a b) -> b -> a
unfoldrMon f b  =
  case f b of
   Just2 a new_b -> a `mappend` unfoldrMon f new_b
   Nothing2      -> mempty


--

---------------------------------------------------
Leibniz1 (stream fusion)
---------------------------------------------------

$ ./Leibniz1 +RTS -sstderr -RTS
d:\coding\haskell\cafe\Leibniz1.exe +RTS -sstderr
EPS:
PI mit EPS 1.0e-8 = 3.1415926445727678
          24,404 bytes allocated in the heap
             892 bytes copied during GC
           3,068 bytes maximum residency (1 sample(s))
          13,316 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:     0 collections,     0 parallel,  0.00s,  0.00s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.03s  (  0.00s elapsed)
  MUT   time    4.59s  (  4.59s elapsed)
  GC    time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    4.63s  (  4.59s elapsed)

  %GC time       0.0%  (0.0% elapsed)

  Alloc rate    5,276 bytes per MUT second

  Productivity  99.3% of total user, 100.0% of total elapsed

---------------------------------------------------
Leibniz3 (monoidal unfoldr)
---------------------------------------------------

$ ./Leibniz3 +RTS -sstderr -RTS
d:\coding\haskell\cafe\Leibniz3.exe +RTS -sstderr
EPS:
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
      92,448,984 bytes allocated in the heap
          15,368 bytes copied during GC
       8,382,800 bytes maximum residency (5 sample(s))
       4,198,684 bytes maximum slop
              17 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:   125 collections,     0 parallel,  1.66s,  1.66s elapsed
  Generation 1:     5 collections,     0 parallel,  0.03s,  0.03s elapsed

  INIT  time    0.03s  (  0.00s elapsed)
  MUT   time    3.50s  (  3.55s elapsed)
  GC    time    1.69s  (  1.69s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    5.22s  (  5.23s elapsed)

  %GC time      32.3%  (32.2% elapsed)

  Alloc rate    26,180,243 bytes per MUT second

  Productivity  67.1% of total user, 66.9% of total elapsed

---------------------------------------------------
Leibniz2 - no stream fusion, unfoldr with strictness annotation on the state
---------------------------------------------------

$ ./Leibniz2 +RTS -sstderr -RTS
d:\coding\haskell\cafe\Leibniz2.exe +RTS -sstderr
EPS:
PI mit EPS 1.0e-8 = 3.141592648589476
  25,600,024,064 bytes allocated in the heap
       2,152,224 bytes copied during GC
           3,336 bytes maximum residency (1 sample(s))
          11,908 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0: 48828 collections,     0 parallel,  2.19s,  2.23s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.02s  (  0.00s elapsed)
  MUT   time   75.16s  ( 76.22s elapsed)
  GC    time    2.19s  (  2.23s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time   77.36s  ( 78.45s elapsed)

  %GC time       2.8%  (2.8% elapsed)

  Alloc rate    340,553,219 bytes per MUT second

  Productivity  97.2% of total user, 95.8% of total elapsed


More information about the Beginners mailing list