[Haskell-cafe] Announce: Probablistic programming in Haskell with bali-phy

Thu Jun 15 20:46:48 UTC 2017

Hi Sergiu,

Thanks for your interest!

On 06/13/2017 12:30 PM, Sergiu Ivanov wrote:
> Probabilistic programs are written using the probability monad.
>> I've implemented a Haskell interpreter in C++ that records execution
>> traces of the Haskell programs.
> Is your probability monad an actual monad that you implemented in a
> Haskell package, or is it a consequence of the fact that you are running
> Haskell-like programs in your interpreter?
1. Its an actual monad.  It allows probability distributions to be 
"performed" by a Haskell function that serves as an interpreter. But I 
haven't implemented it in GHC.  There are actually two interpreters, 
which are at the top of this file.  "IOAndPass" is <<=, and "IOAnd" is 
<<. Please forgive the extreme hackiness :-P
https://github.com/bredelings/BAli-Phy/blob/master/modules/Distributions.hs
The (sample) function performs a distribution d by generating an IO 
action to randomly sample from the distribution.  The (sample') function 
performs a distribution by setting up a node in a (dynamic) graphical 
model, which is also an IO action.

2. Actions in this monad can either
* sample from a distribution.
* specify that we should record samples of an expression.
* alter properties of variables sampled in a given scope.
* observe random outcomes.

For example in the code below, I specify that we should sample a 
dirichlet containing n modifiable variables, but that MCMC should 
perform resampling of the variables at a rate of 1/sqrt(n).

frequencies_model a = do {
   let {n_letters = alphabetSize a;
        letters = alphabet_letters a};
   SamplingRate (1.0/sqrt(intToDouble n_letters)) $ do {
      pi <- dirichlet' n_letters 1.0;
      sequence_ $ zipWith (\p l -> Log l p) pi letters;
      return pi
   }
};

So, the monad here is mainly used to tag variables sampled in a given 
scope.  It might also be used to specify mcmc transition kernels for 
these variables.

I'm not quite sure how to handle scopes outside monads yet, but Venture 
does something like that, and it doesn't use monads.

3. The above code is "mochastic" because we perform the distribution in 
a monadic interpreter in order to sample from it.  But we could also do 
something non-monadic like

let x=sample $ normal 0 1
      y=sample $ gamma 1 1
in x*y

But Haskell itself isn't a monad.  For example, x and y aren't 
sequenced.  Are there Haskell libraries to "run" Haskell programs in 
different interpreters?

4. The low-level C++ interpreter for haskell in bali-phy is not 
monadic.  Its basically a C++ implementation of something like the STG 
machine, but modified to do incremental computation for generic 
Haskell.  I actually used Sestoft (1997) "Deriving a Lazy Abstract 
Machine" instead of the STG.  I guess I could have tried to do this by 
modifying GHCi to track execution traces.  I'm not really sure if that 
would be be a good idea or not.  Any thoughts?

> More generally: how deep can the interoperability between your project
> and some other Haskell code go?
Hmm.. well I guess we could implement *sampling* from the above monad in 
GHC just by making the interpreter call unsafePerformIO when it sees a 
probability distribution.  But implementing MCMC inference would require 
reimplementing incremental computation for Haskell in GHC.

> These execution traces depend on some number of (random) variables,
>> and when one of the variables is changed (during MCMC), the parts of
>> the execution trace that depend on the changed variables are
>> invalidated and recomputed.  This saves work by not recomputing the
>> probability of the new state from scratch.  I can provide DOT graphs
>> of the execution traces if anyone is interested.
> That sounds really cool!
>
>> The problem domain I've been focusing on is evolution of DNA sequences
>> on binary trees.
> I'm very much interested in your project because I often look into
> non-deterministic evolution of some abstract computing devices, and you
> actually seem to be focusing on a similar use case.
Hmm... I guess the idea is to represent a distribution on the possible 
execution histories of non-deterministic haskell programs. So, yeah, 
maybe there is some overlap.

Huh... that actually might work!  Do you handle non-determinism 
probabilistically?  Do you want to condition on observations, or just 
run the abstract machine many times and record the distribution of what 
happens?

-BenRI