# [Haskell-cafe] representations of probability distributions

Benjamin Redelings benjamin.redelings at gmail.com
Tue Jul 4 14:24:13 UTC 2017

```Hi,

On 07/03/2017 05:14 PM, Olaf Klinke wrote:
> Dear cafe,
>
> following up the announcement of bali-phy and the discussion that ensued, I'd like ask the cafe for an overview of representations of probability distributions as data types. It seems that all packages mentioned to far use some flavour of Markov chain. Why? What else is there? Can someone recommend a survey publication/book/talk?

One other class of inference strategies is Sequential Monte Carlo (aka
Particle filters). Unlike MCMC, these methods run multiple points
("particles") in parallel, and the points can have unequal weights.  The
point cloud is incrementally updated as information is added, so these
methods work well for online algorithms where you sequentially add data
points.  One idea in the probabilistic programming community is to
separate the inference strategy from the inference problem, so that one
could apply "pluggable" inference strategies. (See
https://arxiv.org/abs/1404.0099).

There are also "variational" methods, which I think restrict the
posterior distribution to a particular analytic form, and use an
iterative algorithm to find the parameter values of the restricted
distribution that minimize the KL distance to the true posterior
distribution.  So, this is not a Monte Carlo method.

> Explicitly I am interested in:
> - the data type, as in Haskell's type system
> - how an element of the data type relates to the distribution it describes
> - how the monad instance works (if there is one)
> - how to compute conditionals (if computable)
> - how to compute integrals (if computable)
> - what base types are supported (finite types, reals, products ...)
>
> Below is an incomplete, and possibly oversimplified (or even wrong), list of my own.
> -- Olaf
>
> 1. Markov chains
> - uses a random number generator
> - type is something like
>    Markov a = a -> IO a
> - repeated execution produces an infinite trace xs :: [a]
>    where for each predicate p :: a -> Bool, the limit of
>    (length (filter p (take n xs))) % n
>    as n tends to infinity approaches the probability of p
>    under the distribution modeled.
> - conditionals?
> - integrals?
> - supported base types a in Markov a?
>
> 2. Riesz functionals: possibly the most functional representation
> - representation as linear functionals
>    Riesz a = (a -> R) -> R
>    for a suitable real number type R
> - embed predicates in the type (a -> R) via characteristic functions
>    characteristic p = \a -> if p a then 1 else 0
>    Then for a functional phi :: Riesz a and p :: a -> Bool,
>    phi (characteristic p) is the probability of p.
> - Monad instance is the same as for the continuation monad
> - conditionals?
> - integration is trivial: evaluation
> - arbitrary base types
> - but how would you even declare a uniform distribution in this type?
>
> 3. Density functions
> - represent a distribution on the reals as density function w.r.t. the Lebesgue measure
> - limited to reals (and products thereof)
> - data type possibly holds only parameters of parametric families (gaussian, beta, gamma, ...)
> - computing conditionals only for matching pairs of prior and sampling function
>    (conjugate priors), analytically
> - integration analytically
> - no monad instance because no arbitrary base type
>
> 4. Finite distributions
> - type Dist a = [(a,R)] for suitable real number type R
> - optimizations use search trees instead of flat lists
> - basically composition of list and writer monads,
>    where R is regarded as monoid under multiplication
> - represents a distribution of finite support.
>    If phi :: Dist a, then
>    probability p = sum \$ map snd \$ filter (p.fst) phi
> - conditioning possible if base type is of Eq class and R supports division
> - gets unwieldy without Ord instance of base type to keep the list small
>
> 5. Valuations
> - type Val a = (a -> Bool) -> R
>    for suitable real number type R
> - models mathematical definition closely
> - monad bind is integration: requires exact reals R due to limit process
> - conditioning possible if R supports division
> _______________________________________________