[Haskell-cafe] Probabilistic programming [Was: Monads with "The" contexts?]

Thu Jul 19 10:32:55 CEST 2012

> > Exercise: how does the approach in the code relate to the approaches
> > to sharing explained in
> >         http://okmij.org/ftp/tagless-final/sharing/sharing.html
> >
> Chapter 3 introduces an  implicit impure counter, and Chapter 4 uses a
> database that is passed around.
> let_ in Chapter 5 of sharing.pdf realizes the sharing with sort of
> continuation-passing style.The unsafe counter works across the module
> (c.f. counter.zip) but is generally unpredictable...

The reason sharing has the type m a -> m (m a) rather than
m a -> m a is the fact new calculations to share may be created
dynamically. Therefore, we need a supply of keys (gensym). We count on
the monad m to provide the supply. However, top-level computations
(bound to top-level identifiers) are created only once, at the
initialization time. Therefore, a static assignment of identifiers
will suffice. The static assignment is similar to the manual label
assignment technique -- the first technique described Sec 3 of the
sharing.pdf paper. John T. O'Donnell has automated this manual
assignment using TH.

> Now I'm on to the next task; how we represent continuous probability
> distributions? The existing libraries:
>
> Seemingly have restricted themselves to discrete distributions, or at
> least providing Random support for Monte-Carlo simulations. 

I must warn that although it is ridiculously easy to implement
MonadProb in Haskell, the result is not usable. Try to implement HMM
with any of the available MonadProb and see how well it scales. (Hint: we
want the linear time scaling as we evolve the model -- not
exponential). There is a *huge* gap between a naive MonadProb and
something that could be used even for passingly interesting
problems. We need support for so-called `variable elimination'. We
need better sampling procedures: rejection sampling is better
forgotten. Finally, GHC is actually not a very good language system
for probabilistic programming of generative-model--variety. See
	http://okmij.org/ftp/Haskell/index.html#memo-off
for details. 

To give you the flavor of difficulties, consider a passingly
interesting target tracking problem: looking at a radar screen and
figuring out how many planes are there and where they are going:
	http://okmij.org/ftp/kakuritu/index.html#blip
Since the equipment is imperfect, there could be a random blip on the radar
that corresponds to no target. If we have a 10x10 screen and 2%
probability of a noise blip in every of the 100 `pixels', we get the
search space of 2^100 -- even before we get to the planes and their
stochastic equations of motion. Hansei can deal with the problem --
and even scale linearly with time. 

I don't think you can make a monad out of Gaussian distributions. That
is not to say that monads are useless in these problems -- monads are
useful, but at a different level, to build a code for say, MCMC (Monte
Carlo Markov Chain). It this this meta-programming approach that
underlies Infer.net
	http://research.microsoft.com/en-us/um/cambridge/projects/infernet/
and Avi Pfeffer's Figaro
	http://www.cs.tufts.edu/~nr/cs257/archive/avi-pfeffer/figaro.pdf

I should point out Professor Sato of Toukyou Tech, who is the pioneer
in the probabilistic programming
	http://sato-www.cs.titech.ac.jp/sato/
	http://sato-www.cs.titech.ac.jp/en/publication/
You can learn from him all there is to know about probabilistic
programming.