[Haskell-beginners] Continuations

Michael Snoyman michael at snoyman.com
Thu Jan 1 06:42:28 UTC 2015


To give an idea of why CPS transform can sometimes be more efficient,
consider the following two approaches to a "safe divide" function, which
check if the denominator is 0:

divMay :: Double -> Double -> Maybe Double
divMay x y
    | y == 0 = Nothing
    | otherwise = Just (x / y)

divCPS :: Double -> Double -> a -> (Double -> a) -> a
divCPS x y onFail onSuccess
    | y == 0 = onFail
    | otherwise = onSuccess (x / y)

Presumably, divMay will always force the allocation of a new object when y
is not 0, through its usage of Just. divCPS, on the other hand, does not
need to perform any allocation. I say presumably, because often times the
compiler will be able to optimize this. To see evidence of that, I've put
together a small benchmark[1], If I compile with optimizations turned of
(-O0), I get results showing that CPS is faster. And looking at the
core[2], we can see that divMay really does result in an extra allocation:

$ ghc-core --no-cast -- -O0 foo.hs

divMay'_r454 :: Double -> Double
divMay'_r454 =
  \ (y_a4hf :: Double) ->
    Data.Maybe.fromMaybe
      @ Double
      (D# 0.0)
      (case ==
              @ Double $fEqDouble y_a4hf (D# 0.0)
       of _ [Occ=Dead] {
         False ->
           Data.Maybe.Just
             @ Double
             (/
                @ Double
                $fFractionalDouble
                (D# 10.0)
                y_a4hf);
         True -> Data.Maybe.Nothing @ Double
       })

divCPS'_r455 :: Double -> Double
divCPS'_r455 =
  \ (y_a4hg :: Double) ->
    case ==
           @ Double $fEqDouble y_a4hg (D# 0.0)
    of _ [Occ=Dead] {
      False ->
        id
          @ Double
          (/
             @ Double
             $fFractionalDouble
             (D# 10.0)
             y_a4hg);
      True -> D# 0.0
    }

Yes, core has a lot of stuff going on due to the explicit type signatures,
the "magic hash" unboxed types showing up, etc. But as you can see, there's
a call to Data.Maybe.Just in divMay' that is not present in divCPS'.

Now let's explain that "presumably" comment. GHC is really smart, and if
you let it, it will often times optimize away these kinds of things,
especially in small examples like this. In this case, the benchmark results
show up as almost identical between the two versions, which is hardly
surprising when you look at the core:

$ ghc-core --no-cast -- -O2 foo.hs

divMay'_r4ci :: Double -> Double
divMay'_r4ci =
  \ (y_a4oE :: Double) ->
    case y_a4oE of _ [Occ=Dead] { D# x_a4Ml ->
    case x_a4Ml of wild1_X10 {
      __DEFAULT ->
        case /## 10.0 wild1_X10 of wild2_a4Rj { __DEFAULT ->
        D# wild2_a4Rj
        };
      0.0 -> main9
    }
    }

divCPS'_r4cj :: Double -> Double
divCPS'_r4cj =
  \ (y_a4oF :: Double) ->
    case y_a4oF of _ [Occ=Dead] { D# x_a4Ml ->
    case x_a4Ml of wild1_X15 {
      __DEFAULT ->
        case /## 10.0 wild1_X15 of wild2_a4Rj { __DEFAULT ->
        D# wild2_a4Rj
        };
      0.0 -> main9
    }
    }

If you look past the different variable names, you'll see that the two
functions are in fact identical. So to sum up this (longer than expected)
email:

* CPS allows you to avoid extra allocations,
* but often times GHC can perform that optimization for you itself.

You can look at the attoparsec codebase to see its usage of CPS, which is a
more complicated usage of this same concept. Continuation passing web
frameworks are a totally different story though, and have been covered
pretty well elsewhere (probably best by Alberto[3], though others may have
different links).

[1]
https://www.fpcomplete.com/user/snoyberg/random-code-snippets/cps-performance-benchmark
[2] Core is a lower-level form that GHC compiles code to before generating
machine code. It gives a very good idea of what optimizations have been
applied. I used the ghc-core tool for this. Gabriel Gonzalez wrote a nice
blog post about this a few years back:
http://www.haskellforall.com/2012/10/hello-core.html
[3] https://www.fpcomplete.com/user/agocorona

On Thu Jan 01 2015 at 6:00:12 AM Cary Cherng <ccherng at gmail.com> wrote:

> I read (http://en.wikibooks.org/wiki/Haskell/Continuation_passing_style)
> that in some circumstances, CPS can be used to improve performance by
> eliminating certain construction-pattern matching sequences (i.e. a
> function returns a complex structure which the caller will at some
> point deconstruct). And that attoparsec is an example of this.
>
> I don't see exactly how CPS gives rise to concrete examples of
> performance gains. Moreover how does this arise in parsing for example
> with attoparsec as mentioned.
>
>
> I also encountered various web frameworks such as mflow that are based
> on continuations. How a typical http restful system is made into
> something based around continuations is not something that is obvious
> to me. And what is gained from that?
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://www.haskell.org/mailman/listinfo/beginners
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/beginners/attachments/20150101/a31a5abf/attachment.html>


More information about the Beginners mailing list