[Haskell-cafe] Help on syntactic sugar for combining lazy & strict monads?
Benjamin Redelings
benjamin.redelings at gmail.com
Fri Jul 30 06:55:15 UTC 2021
The idea of changing observation to look like `Observation a -> Dist a
-> Dist a` is interesting, but I am not sure if this works in practice.
Generally you cannot actually produce an exact sample from a
distribution plus an observation. MCMC, for example, produces
collections of samples that you can average against, and the error
decreases as the number of samples increases. But you can't generate a
single point that is a sample from the posterior.
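To make that concrete, here is a minimal toy sketch (my own example, not code from any library in this thread): a Metropolis sampler targeting the posterior of a Normal 0 1 prior after observing 2.0 through a Normal x 1 likelihood. Conditioning yields a stream of correlated samples whose running average converges to the posterior mean; no individual draw is an exact posterior sample. The hand-rolled LCG is only there to keep the example dependency-free.

```haskell
-- Toy Metropolis sampler: conditioning gives a *stream* of correlated
-- samples to average over, not a single exact posterior draw.

-- Linear congruential generator: a value in [0,1) and a new seed.
lcg :: Int -> (Double, Int)
lcg s = let s' = (1103515245 * s + 12345) `mod` 2147483648
        in (fromIntegral s' / 2147483648, s')

-- Unnormalised posterior density: Normal 0 1 prior, Normal x 1
-- likelihood for the observation 2.0 (the thread's model, in 1-D).
posterior :: Double -> Double
posterior x = exp (negate (x * x) / 2) * exp (negate ((2.0 - x) ^ 2) / 2)

-- One Metropolis step with a symmetric uniform proposal of width 1.
step :: (Double, Int) -> (Double, Int)
step (x, s) =
  let (u1, s1) = lcg s
      (u2, s2) = lcg s1
      x'       = x + (u1 - 0.5)                    -- propose
      accept   = u2 < posterior x' / posterior x   -- accept/reject
  in (if accept then x' else x, s2)

main :: IO ()
main = do
  let chain = take 20000 (iterate step (0.0, 42))
      xs    = drop 2000 (map fst chain)    -- discard burn-in
      mean  = sum xs / fromIntegral (length xs)
  -- The exact posterior here is Normal 1 (1/2), so the running
  -- average approaches 1.0 as the chain grows.
  print mean
```

The average improves as the chain is extended, which is exactly the sense in which the error "decreases as the number of samples increases" above.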
Maybe it would be possible to use separate types to distinguish
distributions you can directly sample from those you cannot? Something
like `Observation a -> SampleableDist a -> NonsampleableDist a`.
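A hypothetical sketch of what such a type split might look like (all names here, including `SampleableDist`, `NonsampleableDist`, and `observeFrom`, are illustrative inventions, not from any existing library; the LCG is just to keep it self-contained):

```haskell
type Seed = Int

-- Toy uniform generator, so the prior sampler is runnable.
lcg :: Seed -> (Double, Seed)
lcg s = let s' = (1103515245 * s + 12345) `mod` 2147483648
        in (fromIntegral s' / 2147483648, s')

-- A distribution that supports exact sampling *and* exposes a density.
data SampleableDist a = SampleableDist
  { draw :: Seed -> (a, Seed)
  , den  :: a -> Double
  }

-- After conditioning, only an unnormalised density remains: MCMC can
-- target it, but no exact sampler is available.
newtype NonsampleableDist a = NonsampleableDist { unnormDen :: a -> Double }

-- One way to make the types line up: an observation is the likelihood
-- of the observed data as a function of the latent variable.
type Observation a = a -> Double

observeFrom :: Observation a -> SampleableDist a -> NonsampleableDist a
observeFrom like prior = NonsampleableDist (\x -> den prior x * like x)

-- A standard normal, sampled via the Irwin-Hall approximation
-- (sum of 12 uniforms minus 6); density is unnormalised.
stdNormal :: SampleableDist Double
stdNormal = SampleableDist
  { draw = \s0 -> go (12 :: Int) 0 s0
  , den  = \x -> exp (negate (x * x) / 2)
  }
  where
    go 0 acc s = (acc - 6, s)
    go n acc s = let (u, s') = lcg s in go (n - 1) (acc + u) s'

main :: IO ()
main = do
  let (x, _) = draw stdNormal 42
      post   = observeFrom (\z -> exp (negate ((2.0 - z) ^ 2) / 2)) stdNormal
  print (abs x < 6)                                 -- the prior can be sampled
  print (unnormDen post 1.0 > unnormDen post 0.0)   -- the posterior can only be queried
```

The point of the split is that `observeFrom` consumes the sampler: downstream code can evaluate the posterior pointwise (as MCMC needs) but can no longer pretend to draw exact samples from it.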
I will think about whether this would solve the problem with laziness...
-BenRI
On 7/29/21 11:35 PM, Benjamin Redelings wrote:
> Hi Olaf,
>
> I think you need to look at two things:
>
> 1. The Giry monad, and how it deals with continuous spaces.
>
> 2. The paper "Practical Probabilistic Programming with Monads" -
> https://doi.org/10.1145/2804302.2804317
>
> Also, observing 2.0 from a continuous distribution is not nonsensical.
>
> -BenRI
>
> On 7/21/21 11:15 PM, Olaf Klinke wrote:
>>> However, a lazy interpreter causes problems when trying to introduce
>>> *observation* statements (aka conditioning statements) into the monad
>>> [3]. For example,
>>>
>>> run_lazy $ do
>>>   x <- normal 0 1
>>>   y <- normal x 1
>>>   z <- normal y 1
>>>   2.0 `observe_from` normal z 1
>>>   return y
>>>
>>> In the above code fragment, y will be forced because it is returned,
>>> and
>>> y will force x. However, the "observe_from" statement will never be
>>> forced, because it does not produce a result that is demanded.
>>
>> I'm very confused. If the observe_from statement is never demanded,
>> then what semantics should it have? What is the type of observe_from?
>> It seems it is
>> a -> m a -> m ()
>> for whatever monad m you are using. But conditioning usually is a
>> function
>> Observation a -> Dist a -> Dist a
>> so you must use the result of the conditioning somehow. And isn't the
>> principle of Monte Carlo to approximate the posterior by sampling
>> from it? I tend to agree with your suggestion that observations and
>> sampling cannot be mixed (in the same do-notation), but that the
>> latter must be collected into a prior, which is then conditioned by
>> an observation.
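In the discrete case the `Observation a -> Dist a -> Dist a` shape is easy to exhibit. A sketch using the standard weighted-list monad (a textbook construction, not code from this thread), with a predicate observation as in the P.S. below:

```haskell
-- The standard discrete probability monad: outcomes paired with weights.
newtype Dist a = Dist { unDist :: [(a, Double)] }

instance Functor Dist where
  fmap f (Dist xs) = Dist [ (f x, w) | (x, w) <- xs ]

instance Applicative Dist where
  pure x = Dist [(x, 1)]
  Dist fs <*> Dist xs = Dist [ (f x, v * w) | (f, v) <- fs, (x, w) <- xs ]

instance Monad Dist where
  Dist xs >>= k = Dist [ (y, w * v) | (x, w) <- xs, (y, v) <- unDist (k x) ]

uniformD :: [a] -> Dist a
uniformD xs = Dist [ (x, 1 / fromIntegral (length xs)) | x <- xs ]

-- Conditioning on a predicate: discard inconsistent outcomes and
-- renormalise.  Crucially, the *result* is used downstream, unlike an
-- `m ()` statement that nothing ever demands.
conditionBy :: (a -> Bool) -> Dist a -> Dist a
conditionBy p (Dist xs) =
  let kept  = [ (x, w) | (x, w) <- xs, p x ]
      total = sum (map snd kept)
  in Dist [ (x, w / total) | (x, w) <- kept ]

-- Exact observation is the special case `conditionBy (== o)`.
condition :: Eq a => a -> Dist a -> Dist a
condition o = conditionBy (== o)

main :: IO ()
main = do
  let pair = do
        x <- uniformD [1, 2 :: Int]
        y <- uniformD [1, 2]
        return (x, y)
      post = conditionBy (\(x, y) -> x + y == 3) pair
  -- P(first draw = 1 | sum = 3) = 1/2
  print (sum [ w | ((x, _), w) <- unDist post, x == 1 ])  -- prints 0.5
```

On a continuous space this renormalisation step is exactly what breaks down for point observations, since the kept set has measure zero; that is where the density-based treatment above comes in.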
>>
>> What is the semantic connection between your sample and observation
>> monad? What is the connection between both and the semantic
>> probability distributions? I claim that once you have typed
>> everything, it becomes clear where the problem is.
>>
>> Olaf
>>
>> P.S. It has always bugged me that probabilists use elements and
>> events interchangeably, while this can only be done on discrete
>> spaces. So above I would rather like to write
>> (2.0==) `observe_from` (normal 0 1)
>> which still is a nonsensical statement if (normal 0 1) is a
>> continuous distribution where each point set has probability zero.