[Haskell-cafe] Explicit approach to lazy effects For probability monad?

Olaf Klinke olf at aatal-apotheke.de
Thu Oct 17 20:03:39 UTC 2019


Hi Benjamin, 

Your example code seems to deal with two distinct types: 
The do-notation is about the effects monad (on the random number generator?) and the `sample` function pulls whatever representation you have for an actual probability distribution into this effect monad. In my mental model, the argument to `sample` represents a function Double -> x that interprets a number coming out of the standard random number generator as an element of type x. 
I suggest considering the following two properties of the mathematical probability monad (a.k.a. the Giry monad), which I will use as syntactic rewrite rules in your modeling language. 

The first property is Fubini's Theorem. In Haskell terms it says that for all f :: x -> y -> m z, a :: m x and b :: m y, the two terms

do {x <- a; y <- b; f x y}
do {y <- b; x <- a; f x y}

are semantically equivalent. (For state monads, this fails.) Monads where this holds are said to be commutative. If you have two urns, then drawing from the left and then drawing from the right is the same as first drawing from the right and then drawing from the left. Using Fubini, we can swap the first two lines in your example: 

model = do
   cond <- bernoulli 0.5
   x <- normal 0 1
   return (if cond == 1 then x else 0)

This desugars to

bernoulli 0.5 >>= (\cond -> normal 0 1 >>= (\x -> return (if cond == 1 then x else 0)))

or, rewriting the inner bind as an fmap,

bernoulli 0.5 >>= (\cond -> fmap (\x -> if cond == 1 then x else 0) (normal 0 1))
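
To make these equivalences something one can actually compute with, here is a toy finite-distribution monad; this is purely my own illustration (with bernoulliP standing in for your bernoulli), not your library:

-- A toy finite-distribution monad, only to make the rewrite rules
-- testable as concrete list manipulations.
newtype P a = P { runP :: [(a, Rational)] }

instance Functor P where
  fmap f (P xs) = P [ (f a, w) | (a, w) <- xs ]

instance Applicative P where
  pure a = P [(a, 1)]
  pf <*> pa = pf >>= \f -> fmap f pa

instance Monad P where
  P xs >>= k = P [ (b, w * v) | (a, w) <- xs, (b, v) <- runP (k a) ]

bernoulliP :: Rational -> P Int
bernoulliP p = P [(1, p), (0, 1 - p)]

-- Fubini/commutativity: both orders of drawing describe the same
-- distribution on pairs.
lhs, rhs :: P (Int, Int)
lhs = do { x <- bernoulliP 0.5; y <- bernoulliP 0.3; return (x, y) }
rhs = do { y <- bernoulliP 0.3; x <- bernoulliP 0.5; return (x, y) }

Expanding runP lhs and runP rhs gives the same four weighted pairs (up to ordering).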

The second property is a kind of laziness, namely 

    for every b, fmap (const x) b and return x are semantically equivalent.

which holds for mathematical distributions (but not for state monads). Now one could argue that in the case cond == 0 the innermost function is constant in x, so the whole expression does not depend on the argument (normal 0 1). The lemma we need here is the semantic equivalence of the following two lambda terms. 

\cond -> \x -> if cond == 1 then x else 0
\cond -> if cond == 1 then id else const 0

If the above is admissible then the following syntactic transformation is allowed: 

model = do
   cond <- bernoulli 0.5
   if cond == 1 then normal 0 1 else return 0
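
As a sanity check that this transformation preserves the model, one can compare both versions in the toy monad P from above, with a made-up discrete stand-in for normal 0 1 (again only an illustration, not your normal):

-- A discrete stand-in for `normal 0 1`, only so that both versions of
-- the model fit into the toy monad P.
fakeNormal :: P Int
fakeNormal = P [(-1, 1/4), (0, 1/2), (1, 1/4)]

modelA, modelB :: P Int
modelA = do                    -- always draws x
  cond <- bernoulliP 0.5
  x <- fakeNormal
  return (if cond == 1 then x else 0)

modelB = do                    -- the transformed version
  cond <- bernoulliP 0.5
  if cond == 1 then fakeNormal else return 0

Merging equal outcomes, both modelA and modelB assign weight 1/8 to -1, 3/4 to 0 and 1/8 to 1; and modelB never inspects fakeNormal in the cond == 0 branch.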

The transformed version makes it obvious that the normal distribution is only sampled when needed. But I don't know whether you would regard this as the same model.

Notice that I disregarded your `sample` function. That is, my monadic language is the monad of probabilities, not the monad of state transformations of a random number generator. Maybe you can delay using the random number generator until the very end? I don't know the complete set of operations your modeling language sports. If that delay is possible, then maybe you can use a monad that has the above two properties (e.g. a reader monad) and only feed the random numbers to the final model. As a proof of concept, consider the following.

import Control.Monad.Reader (Reader, reader, runReader)

type Model = Reader Double
model :: Model Int
model = do
  x <- reader (\r -> last [1..round (recip r)])
  cond <- reader (\r -> r > 0.5)
  return (if cond then x else 0)

runReader model r returns quickly even for very small r, although the first line would then be very expensive to evaluate; this would not be the case if that line were always forced. 
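For instance (continuing the snippet above; a minimal usage sketch, not from your code):

main :: IO ()
main = do
  -- r = 1.0e-9 makes the first binding (last [1..10^9]) expensive,
  -- but cond is False, so x is never forced and 0 prints immediately.
  print (runReader model 1.0e-9)
  -- r = 0.9: cond is True, x is forced; last [1..round (recip 0.9)] = 1.
  print (runReader model 0.9)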

Cheers,
Olaf

