[Haskell-cafe] Lazy Probabilistic Programming System in Haskell

Tue Jul 20 18:57:25 UTC 2021

On 7/17/21 12:29 PM, Olaf Klinke wrote:
>>>
>>> On 7/16/21 8:27 AM, Olaf Klinke wrote:
>>>
>>>> Hi,
>>>>
>>>> My program BAli-Phy implements probabilistic programming with 
>>>> models written as Haskell programs.
>>>>
>>>> http://www.bali-phy.org/models.php
>>>
>>> Dear Benjamin,
>>> last time you announced BAli-Phy I pestered you with questions about
>>> semantics. In the meantime there was a discussion [1] on this list
>>> regarding desirable properties of probabilistic languages and monads in
>>> general. A desirable property of any probabilistic language is that
>>> when you define a distribution but map a constant function over it,
>>> then this has the same computational cost as returning the constant
>>> directly. Can you say anything about that?
>>> Cheers,
>>> Olaf
>>>
>>> [1]
>> https://mail.haskell.org/pipermail/haskell-cafe/2020-November/132905.html 
>>
>>>
>>>
>>>
>>
>> On Fri, 16 Jul 2021, Benjamin Redelings wrote:
>>
>>
>> Hi Olaf,
>>
>> Are you asking if
>>
>>     run $ (const y) <$> normal 0 1
>> has the same cost as
>>
>>     run $ return y
>>
>> for some interpreter `run`?
>>
>> Yes, the cost is the same.  In do-notation, we would have
>>
>>     run $ do
>>         x <- normal 0 1
>>         return $ (const y x)
>>
>> Since `const y` never forces `x`, no time is spent evaluating `run $ 
>> normal 0 1`.  That is basically
>> what I mean by saying that the language is lazy.
>>
>> -BenRI
>>
>
> Awesome! That is something you can not have with (random number-)state 
> based implementations, as far as I know, because
> x <- normal 0 1
> at least splits the random number generator. Hence running the above 
> ten thousand times even without evaluating the x does have a 
> non-neglegible cost. So how did you implement the lazyness you 
> described above? Do you have thunks just like in Haskell?
>
> Last time I remarked that the online documentation contains no proper 
> definition of the model language. Some examples with explanations of 
> individual lines are not enough, IMHO. That appears not to have 
> changed since the last release. So why don't you include a list of 
> keywords and built-in functions and their meaning/semantics?
>
> Regards,
> Olaf

Hi Olaf,

1. My VM is based on Sestoft (1997) "Deriving a lazy abstract machine".  
However, when a closure is changeable, we do not overwrite the closure 
with its results.  Instead we store a allocate a new heap location for 
the result, and record a pointer from the closure to its result.  This 
allows us to erase results when modifiable variables change, while 
retaining reduction steps that do not depend on the modifiable variables.

The VM does have thunks.  This means that there is an O(1) cost for "x 
<- normal 0 1" in the program fragment above.  So, it is more precise to 
say that

     run $ (const y) <$> normal 0 1

has the same cost as

     run $ let x = (run $ normal 0 1) in return y

I think for my purposes, an O(1) cost is fine.  But if you want NO cost 
at all, then I think this would require code optimization to eliminate 
the thunk allocated for x.

2. In response to the question about splitting the random number 
generator, I have a few thoughts.

2a. Splitting the random number generator, should lead to an O(1) cost 
per unforced random sample.  I think an O(1) cost is fine, unless the 
overhead is very high.

2b. In my code, the VM gets random numbers from a runtime library api 
that delivers true random numbers.  This could be implemented by a 
hardware instruction for example.  In practice it is delivered by a 
pseudorandom number generate with its own internal state.  However, for 
my purposes, I think both are fine.

3. I would be happy to add more documentation, if it is actually helpful 
in figuring out the language.  When learning HTML, I did not read the 
formal specification, but modified existing examples.  Are the examples 
actually hard to follow?  I am afraid that if I try and write "formal" 
specifications, then there will always be someone who declares them 
improper and not formal enough.  But if the documentation is simply bad 
(which is probably true), I would be happy to improve it.  Does that 
make sense?

> So why don't you include a list of keywords and built-in functions and 
> their meaning/semantics? 
This seems reasonable.  I will try to do that.

BTW, part of the confusion might stem from the fact that the language 
has two monads: a strict monad for observations, and a lazy monad for 
random sampling.  Perhaps you would have insights on my question in my 
previous e-mail about combining lazy and strict monads?

take care,

-BenRI