Sat Jan 13 21:45:02 UTC 2018

```Long story short: You can stay inside Haskell's type system or extend it, but you end up with monads either way. I'll explain why.

> (a) why are monads a perfect solution to my problem?
Two reasons. First: Probability distributions do form a monad [Giry,CS,Kock]. It's a mathematical fact. So why not exploit it? It gives you powerful combinators and powerful abstractions.
Second: Because the requirements on the type modificators [N] and [D] are describing many features monads have. First observe that you don't really need [D] because any deterministic computation can be embedded into the non-deterministic computations by making the set of possibilities a singleton. That is precisely what 'return' does in the context of monads. (Think of return for the list monad.)
> any function that takes a non-deterministic input must have a non-deterministic output.
That is common with monads, too: There is no generic function that can extract values out of monads.
> we should merge variables with identical values if the types are non-deterministic.
That is problematic on several levels. First, is "identical values" decidable by the compiler? Second, equality might depend on the implementation, which might change. For example, (normal 0 1) could contain a random number generator with an intrinsic state.
> sample:: (()->a[N]) -> a[N]
Observe that, disregarding non-terminating computations, any type t is isomorphic to () -> t. Knowing this, your statement seems to imply that a non-deterministic computation is identified with what you can sample from it.
> iid:: Int -> (() -> a[N]) -> [a[N]]
The function Control.Monad.sequence plays this role for monads, with a slightly different type signature.
Control.Monad.sequence . Data.List.repeat :: Monad m => m a -> m [a]
Let the name not deceive you: It is not about sequential computations, rather about ordering the random variables sequentially.

> (b) can we extend the HM type system to support non-determinism directly?
Yes, you can. But then you have a different language. There many publications describing that [Draheim,Lago,Borgström,RP]. In a nutshell, start with the simply typed lambda calculus and add an operation

choose :: I -> a -> a -> a

where I is the the type of real numbers 0 <= x <= 1. 'choose' makes a weighted probabilistic choice between its second and third argument.
However, in order to describe what programs in (lambda calculus + choose) actually _mean_, you need two things:
(1) Define what the compiled program should do at runtime (operational semantics).
(2) Define what the program means, mathematically (denotational semantics).

For (1), the prevailing approach seems to be to equate a probabilistic term with its behaviour under sampling. Markov Chains are popular [MCMC]. I read it's non-trivial, however, to find a Markov chain that behaves according to a mathematically defined probability distribution. The probability that you know more about that than I do is 1.

For (2), the mathematical meaning of ordinary Haskell programs are given via the following mapping. Every Haskell type t is associated with a domain D [DOM] and each Haskell function of type t -> t' is associated with a mathematical function f: D -> D' between the associated domains. Whenever non-determinism is involved, e.g. a probabilistic computation on type t, then instead of D one uses P(D), a suitable "powerdomain". There are various P for various sorts of non-determinism (see the work of Gordon Plotkin), and each of them is a monad on the category of domains. A major problem is whether the P for probabilisitc choice works well with e.g. function types. That is why many papers restrict themselves to first-order functions. Another notoriously hard problem is to combine different sorts of non-determinism. It is like combining monads in Haskell: Some monads have monad transformers associated with them, but some don't.

Finally, you might want to play with non-determinism other than the probabilistic one. For example, there is the infinite-search package on hackage, which provides a monad of plain non-deterministic choice beyond finite lists. It is even possible to define a Haskell probability monad in the same spirit. I can provide some code if you wish.

Regards,
Olaf

References
[Giry] http://dx.doi.org/10.1007/BFb0092872
[CS] http://dx.doi.org/10.1007/s10485-013-9324-9
[Kock] http://www.tac.mta.ca/tac/volumes/26/4/26-04.pdf
[Draheim] http://www.springer.com/de/book/9783642551970
[Lago] https://arxiv.org/abs/1104.0195
[Borgström] https://arxiv.org/abs/1512.08990