[Haskell-cafe] Making monadic code more concise

Mon Nov 15 12:43:14 EST 2010

Hi,

I'm fairly new to Haskell and recently came across some programming
tricks for reducing monadic overhead, and am wondering what
higher-level concepts they map to. It would be great to get some
pointers to related work.

Background:

I'm a graduate student whose research interests include methods for
implementing domain specific languages. Recently, I have been trying
to get more familiar with Haskell and implementing DSLs in it. I'm
coming from having a fair bit of experience in Python so I know the
basics of functional programming.

However, I'm much less familiar with Haskell. In particular I have
little to no internal map from existing DSL implementation techniques
to the Haskell extensions that would no doubt make DSL implementations
easier (and when they are *not* needed).

I also don't have a complete picture of the functional programming
research that would inform these techniques. I would greatly
appreciate it if I could get pointers to the appropriate references so
I can really get going on this.

Specifically: There are some DSLs that can be largely expressed as monads,
that inherently play nicely with expressions on non-monadic values.
We'd like to use the functions that already work on the non-monadic
values for monadic values without calls to liftM all over the place.

The probability monad is a good example.

import Control.Monad
import Data.List

newtype Prob a = Prob { getDist :: [(a, Float)] } deriving (Eq, Show)

multiply :: Prob (Prob a) -> Prob a
multiply (Prob xs) = Prob $ concat $ map multAll xs
	where multAll (Prob innerxs, p) = map (\(x, r) -> (x, p * r)) innerxs

instance Functor Prob where
	fmap f (Prob xs) = Prob $ map (\(x, p) -> (f x, p)) xs

instance Monad Prob where
	return x = Prob [(x, 1.0)]
	x >>= f = multiply (fmap f x)

In this monad, >>= hides the multiplying-out of conditional
probabilities that happen during the composition of a random variable
with a conditional distribution.

coin x = Prob [(1, x), (0, 1 - x)]

test = do
	x <- (coin 0.5)
	y <- (coin 0.5)
	return $ x + y

*Main> test
Prob {getDist = [(2,0.25),(1,0.25),(1,0.25),(0,0.25)]}

We can use a 'sum out' function to get more useful results:

sumOut :: (Ord a) => Prob a -> Prob a
sumOut (Prob xs) = Prob $ map (\kvs -> foldr1 sumTwoPoints kvs) eqValues
	where
		eqValues = groupBy (\x y -> (fst x == fst y)) $ sortBy compare xs
		sumTwoPoints (v1, p1) (v2, p2) = (v1, p1 + p2)

*Main> sumOut test
Prob {getDist = [(0,0.25),(1,0.5),(2,0.25)]}

I'm interested in shortening the description of 'test', as it is
really just a 'formal addition' of random variables. One can use liftM
for that:

test = liftM2 (+) (coin 0.5) (coin 0.5)

It seems what I'm leading into here is making functions on ordinary
values polymorphic over their monadic versions; I think this is the
desire for 'autolifting' or 'monadification' that has been mentioned
in works such as HaRE

http://www.haskell.org/pipermail/haskell/2005-March/015557.html

One alternate way of doing this, however, is instancing the
typeclasses of the ordinary values with their monadic versions:

instance (Num a) => Num (Prob a) where
	(+) = liftM2 (+)
	(*) = liftM2 (*)
	abs = liftM abs
	signum = liftM signum
	fromInteger = return . fromInteger

instance (Fractional a) => Fractional (Prob a) where
	fromRational = return . fromRational
	(/) = liftM2 (/)

Note that already, even though each function in the typeclass had to
be manually lifted, this eliminates more overhead compared to lifting
every function used, because any function with a general enough type
bound can work with both monadic and non-monadic values, not just the
ones in the typeclass:

*Main> sumOut $ (coin 0.5) + (coin 0.5) + (coin 0.5)
Prob {getDist = [(0,0.125),(1,0.375),(2,0.375),(3,0.125)]}
*Main> let foo x y z = (x + y) * z
*Main> sumOut $ foo (coin 0.5) (coin 0.5) (coin 0.5)
Prob {getDist = [(0,0.625),(1,0.25),(2,0.125)]}

Because of the implementation of fromInteger as return . fromInteger,
we also 'luck out' and have the ability to mix ordinary and
non-monadic values in the same expression:

*Main> 1 + coin 0.5 / 2
Prob {getDist = [(1.5,0.5),(1.0,0.5)]}

My question is, what are the higher-level concepts at play here? The
motivation is that it should be possible to automatically do this
typeclass instancing, letting us get the benefits of concise monadic
expressions without manually instancing the typeclasses.

Indeed, I didn't have this idea in Haskell; I'm coming from Python
where one can realize the automatic instances: if we take the view
that classes in Python are combined datatypes and instanced
typeclasses, we can use the meta-object protocol to look inside one
class's representation and output another class with liftM-ed (or
return . -ed) methods and a custom constructor. I realize Template
Haskell gives you the ability to reify instance/class declarations,
but perhaps there's a less heavyweight way (or there should be).

I think a good question as a starting point is whether it's possible
to do this 'monadic instance transformation' for any typeclass, and
whether or not we were lucky to have been able to instance Num so
easily (as Num, Fractional can just be seen as algebras over some base
type plus a coercion function, making them unusually easy to lift if
most typeclasses actually don't fit this description).

In general, what this seems to be is a transformation on functions
that also depends explicitly on type signatures. For example in Num:

class (Eq a, Show a) => Num a where
	(+), (-), (*) :: a -> a -> a
	negate, abs, signum :: a -> a
	fromInteger :: Integer -> a

instance (Num a) => Num (Prob a) where
	(+) = liftM2 (+) -- Prob a -> Prob a -> Prob a
	(*) = liftM2 (*) -- Prob a -> Prob a -> Prob a
	abs = liftM abs -- Prob a -> Prob a
	signum = liftM signum -- Prob a -> Prob a
	fromInteger = return . fromInteger -- Integer -> Prob a

Note that if we consider this in a 'monadification' context, where we
are making some choice for each lifted function, treating it as
entering, exiting, or computing in the monad, instancing the typeclass
leads to very few choices for each: the monadic versions of +, -, *
must be obtained with "liftM2",the monadic versions of negate, abs,
signum must be obtained with "liftM", and the monadic version of
fromInteger must be obtained with "return . "

I think this is what we're doing in general: if we had a typeclass C
with a type variable a, with some set of type signatures in which 'a'
appears bound, we can do "instance C (M a)" for some monad M if there
is some way to realize the resulting set of type signatures where
every bound occurence of 'a' is replaced with 'M a'. The following
script illustrates precisely what I mean by this:

data Typ = Gr String -- Irreducible grounded type, like Int or Bool
	| Con String Typ -- Type constructor applied to a type of kind *,
i.e., IO String, Maybe (Prob Int)
	| Arr Typ Typ -- Function T1 -> T2
	| Tup (Typ, Typ) -- Tuple (T1, T2)
	deriving Eq

instance Show Typ where
	show (Gr a) = a
	show (Con m a) = "(" ++ m ++ " " ++ show a ++ ")"
	show (Arr a b) = "(" ++ show a ++ " -> " ++ show b ++ ")"
	show (Tup (x, y)) = "(" ++ show x ++ ", " ++ show y ++ ")"

--let     a   =  b  in  expr for our type signature calculus
typlet :: Typ -> Typ -> Typ -> Typ
typlet a b expr = case a == expr of
	False -> typletdown a b expr
	True -> b

typletdown a b (Tup (x, y)) = Tup (typlet a b x, typlet a b y)
typletdown a b (Arr x y) = Arr (typlet a b x) (typlet a b y)
typletdown a b (Con str c) = Con str (typlet a b c)
typletdown a b (Gr s) = Gr s

monadify a b expr = typlet a (Con b a) expr

-- class declaration : the free type variable, with a list of signatures
type ClassDecl = (Typ, [Typ])

-- monadify class signatures
monadic_sigs :: ClassDecl -> String -> ClassDecl
monadic_sigs (btyp, sigs) it = (Con it btyp, map (\sig -> monadify
btyp it sig) sigs)

-- Num
mkNum = (Gr "a", [
	(Gr "a") `Arr` ((Gr "a") `Arr` (Gr "a")), -- (+), (-), (*)
	(Gr "a") `Arr` (Gr "a"), -- abs, signum, negate
	(Gr "Integer") `Arr` (Gr "a")]) -- fromInteger

main = do
	let testexpr =(Gr "a") `Arr` ((Gr "a") `Arr` (Gr "a"))
	print "Sample type signature:"
	print testexpr
	print "Monadified:"
	print $ monadify (Gr "a") "Prob" testexpr
	print $ "Num typeclass signatures:"
	print $ mkNum
	print "Signatures of the functions needed for instance (Num a) => Num
(Prob a):"
	print $ monadic_sigs mkNum "Prob"

I suppose I'm basically suggesting that the 'next step' is to somehow
do this calculation of types on real type values, and use an inductive
programming tool like Djinn to realize the type signatures. I think
the general programming technique this is getting at is an orthogonal
version of LISP style where one goes back and forth between types and
functions, rather than data and code. I would also appreciate any
pointers to works in that area.

Thanks for any leads,

Lingfeng Yang
lyang at cs dot stanford dot edu