[Haskell-cafe] Feeding a monad into itself

Sun Jul 23 21:46:20 UTC 2017

MarLinn, 

you observed well, I swept something under the carpet. So here are the definitions in full glory. Apologies to all the not-so mathematically minded in Haskell-Cafe, the mail has grown rather long. 

A partially ordered set is a type X together with a function
(≤) :: X -> X -> Bool
obeying three laws.
∀ x. x ≤ x
x ≤ y and y ≤ z implies x ≤ z
x ≤ y and y ≤ x implies x == y

Note that total orders, like the Ord typeclass, require that either x ≤ y or y ≤ x hold, but not so partial orders. 
Now suppose you have two partially ordered types, say X and A. By abuse of notation we use ≤ for each of the orders. 
Suppose we have functions 
f :: X -> A
g :: A -> X
Suppose further that f and g are order-preserving, that is
x ≤ x' implies f x ≤ f x'
a ≤ a' implies g a ≤ g a'.
Then f is said to be left adjoint to y if for all x :: X and a :: A,
f x ≤ a precisely when x ≤ g a.

If X is partially ordered, so is each function space (as you suggested correctly), namely pointwise. 
Define for f, g :: Z -> X
f ≤ g when ∀ z. f z ≤ g z

If you like, (->) Z is an applicative functor and the ≤ function for the type Z -> X is built using liftA2 (≤).

A bounded lattice is a partially ordered type L that has two nullary and two binary operations:
least :: L
greatest :: L
min :: L -> L -> L
max :: L -> L -> L
subject to the following laws.
∀ x. least ≤ x
∀ x. x ≤ greatest
z ≤ x and z ≤ y precisely when z ≤ min x y
x ≤ z and y ≤ z precisely when max x y ≤ z

An example is the type Bool where False ≤ True, least = False, greatest = True, min = (&&) and max = (||).
Observe that for each x :: L, the partial application (min x) is a monotone function of type L -> L. 
Hence we can ask when each such function has a right adjoint. If each such right adjoint exists, we call L a Heyting algebra and denote the right adjoint to (min x) by (-->) x. 

For every MonadPlus m, we can embed Bool into m () using the function Control.Monad.guard.
In this reading, False is represented by Nothing and True is represented by Just ().
We thus obtain partial orders on Maybe () as well as on any function type X -> Maybe ().
Recall that my (/\) function had type 
m a -> (a -> m ()) -> m a
Specialising to a = () we obtain

meet = \x y -> x /\ (const y) :: m () -> m () -> m ()

which you can convince yourself produces the (&&) operator in case of Maybe () == Bool. 
Likewise, my arrow (-->) has type
m a -> m a -> a -> m ()
from which we can build

impl = \x y -> (x --> y) () :: m () -> m () -> m ()

Now, in the bounded distributive lattice Bool the right adjoint to (&&) x is the logical implication (→) x and we check

ghci> guard True `impl` guard True :: Maybe ()
Just ()
ghci> guard True `impl` guard False :: Maybe ()
Nothing
ghci> guard False `impl` guard False :: Maybe ()
Just ()
ghci> guard False `impl` guard True :: Maybe ()
Just ()

Summarising, for some MonadPlus monads m, the type m () looks like a bounded lattice with 
least = mzero
greatest = return ()
max = mplus
min = meet
and for a subclass of these monads, including m = Maybe, m () is a Heyting algebra with the right adjoint to min defined as the impl function above. 

On to a more complicated example. Let R = [0,∞] denote the type of non-negative real numbers, including positive infinity, (totally) ordered by the obvious ≤ relation. Observe that for each r :: R, the multiplication function 
(*) r :: R -> R 
is order-preserving if we define 0 * ∞ = 0. Hence we can ask for its right adjoint, which will be a function of type R -> R as well. Exercise: Prove that the right adjoint is division (/) of real numbers, with boundary cases 
r / 0 = ∞ = ∞ / x for all r and x. 
This demonstrates the parallels between logical implication and divison: They are both right adjoints in a certain sense. Is there a monad m with m () = R? Yes: The monad G of measures, a.k.a. the Giry monad. Is is not a Haskell monad, though. There is no Haskell type which has R as denotational semantics. But let's pretend. Let's pretend all sets are finite and of Eq class. Then a measure on X can be encoded by a list of type [(X,R)]. It is what packages like probable do. 
What is (/\) for this monad of measures? Look at the type:
(/\) :: G a -> (a -> R) -> G a
It modifies a measure by a real-valued function. If x :: X previously had the weight r, then the new weight is r * (f x). Obviously, for each measure dx the operator \f -> dx /\ f is order-preserving, since multiplication is monotone. Hence we can ask for a right adjoint. It must be a function of type 
G a -> G a -> (a -> R)
associating a real-valued function to a pair of measures. For a = () we already know the answer: It is division. For all other types, the right adjoint is known as the Radon-Nikodym derivative [*]. It is an important gadget in machine learning because it can be used to build conditional probabilites. Intuitively, you can see the connection: For a conditional probability, you first filter the list [(X,R)] for all the allowed x, then divide by the sum of all remaining weights to make the total weight 1 again. Hence without division there is no conditional probability. 

Moral of the story: Logical implication and conditional probabilities are both instances of the same problem, only for different monads. 

-- Olaf

[*] Again, I am sweeping stuff under the carpet. On function space (X -> R), the order ≤ is defined up to sets of measure zero.