[Haskell-cafe] monads once again: a newbie perspective

Thu Aug 24 14:53:03 EDT 2006

Hello!

I' m new to Haskell and try to find my way through types and monads.
I tried the yet Another Haskell Tutorial, very useful for types, but
almost unreadable for monads (that's my perspective!).
Then I discovered that wonderful paper by Wadler (Monads for
functional programming).
So I started "translating it" for someone who can be scared of
something with an abstract and footnotes coming from a professor.

I started writing it in order to clarify to myself this difficult
topic. I think I'm now grasping the concept of monads. 
I thought that someone else could find my writings useful.

It could become a page on the wiki. But before posting there I would
like to have your opinion. Perhaps this is just something unreadable.

Let me know.
Andrea
-------------- next part --------------
An evaluation of Philip Wadler's "Monads for functional programming"
(avail. from http://homepages.inf.ed.ac.uk/wadler/topics/monads.html)

Let's start with something simple: suppose we want to implement a new
programming language. We just finished with Abelson and Sussman's
Structure and Interpretation of Computer Programs
[http://swiss.csail.mit.edu/classes/6.001/abelson-sussman-lectures/]
and we want to test what we have learned.

Our programming language will be very simple: it will just compute the
sum operation.
So we have just one primitive operation (Add) that takes to constants
and calculates their sum
For instance, something like:
(Add (Con 5) (Con 6))
should yeld:
11

We will implement our language with the help of a data type
constructor such as:

> module MyMonads where
> data Term = Con Int
>          | Add Term Term
>            deriving (Show)

After that we build our interpreter:

> eval :: Term -> Int
> eval (Con a) = a
> eval (Add a b) = eval a + eval b

That's it. Just an example:

*MyMonads> eval (Add (Con 5) (Con 6))
11

Very very simple. The evaluator checks if its argument is a Cons: if
it is it just returns it.
If it's not a Cons, but it is a Term, it evaluates the right one and
sums the result with the result of the evaluation of the second term.

Now, that's fine, but we'd like to add some features, like providing
some output, to show how the computation was carried out.
Well, but Haskell is a pure functional language, with no side effects,
we were told.
Now we seem to be wanting to create a side effect of the computation,
its output, and be able to stare at it...
If we had some global variable to store the out that would be
simple...
But we can create the output and carry it along the computation,
concatenating it with the old one, and present it at the end of the
evaluation together with the evaluation of the expression!
Simple and neat!

> type MOut a = (a, Output)
> type Output = String
> 
> formatLine :: Term -> Int -> Output
> formatLine t a = "eval (" ++ show t ++ ") <= " ++ show a ++ " - "                                                       
> 
> evalO :: Term -> MOut Int
> evalO (Con a) = (a, formatLine (Con a) a)
> evalO (Add t u) = ((a + b),(x ++ y ++ formatLine (Add t u) (a + b)))
>     where (a, x) = evalO t
>           (b, y) = evalO u

Now we have what we want. But we had to change our evaluator quite a
bit. First we added a function, that takes a Term (of the expression
to be evaluated), an Int (the result of the evaluation) and gives back
an output of type Output (that is a synonymous of String). 

The evaluator changed quite a lot! Now it has a different type: it
takes a Term data type and produces a new type, we called MOut, that
is actually a pair of a variable type a (an Int in our evaluator) and
a type Output, a string.
So our evaluator, now, will take a Term (the type of the expressions
in our new programming language) and will produce a pair, composed of
the result of the evaluation (an Int) and the Output, a string.

So far so good. But what's happening inside the evaluator?
The first part will just return a pair with the number evaluated and
the output formatted by formatLine. 
The second part does something more complicated: it returns a pair
composed by 
1. the result of the evaluation of the right Term summed to the result
of the evaluation of the second Term
2. the output: the concatenation of the output produced by the
evaluation of the right Term, the output produced by the evaluation of
the left Term (each this evaluation returns a pair with the number and
the output), and the formatted output of the evaluation.

Let's try it:
*MyMonads> evalO (Add (Con 5) (Con 6))
(11,"eval (Con 5) <= 5 - eval (Con 6) <= 6 - eval (Add (Con 5) (Con 6)) <= 11 - ")
*MyMonads>

It works! Let's put the output this way:
eval (Con 5) <= 5 - 
eval (Con 6) <= 6 - 
eval (Add (Con 5) (Con 6)) <= 11 -

Great! We are able to produce a side effect of our evaluation and
present it at the end of the computation, after all.

Let's have a closer look at this expression:
evalO (Add t u) = ((a + b),(x ++ y ++ formatLine (Add t u) (a + b)))
     where (a, x) = evalO t
           (b, y) = evalO u

Why all that? The problem is that we need "a" and "b" to calculate their
sum, together with the output coming from their calculation (to be
concatenated by the expression x ++ y ++ formatLine ...).
So we need to separate the pairs produced by "evalO t" and "eval u"
(remember: eval now produces a value of type M Int, i.e. a pair of an
Int and a String!).

Is there a more general way of doing so?

Let's analyze the evaluator from another perspective. From the type
perspective.
We solved our problem by creating a new type, a pair of an Int (the
result of the evaluation) and a String (the output of the process of
evaluation).

The first part of the evaluator does nothing else but creating, from
a value of type Int, an object of type M Int (Int,Output). It does so
by creating a pair with that Int and some text.

The second part evaluates the two Term(s) and "stores" the values thus
produced in some variables to be use later to compute the output.

Let's focus on the "stores" action. The correct term should be
"binds".
Take a function:
f x = x + x
"x" appears on both sides of the expression. We say that on the right
side "x" is bound to the value of x given on the left side.
So
f 3
binds x to 3 for the evaluation of the expression "x + x".
Our evaluator binds "a" and "x" / "b" and "y" with the evaluation of
"eval t" and "eval u" respectively. "a","b","x" and "y" will be then
used in the evaluation of ((a+)(x ++ ...). 
We know that there is an ad hoc operator for binding variables to a
value: lambda, or \.

So we can try to abstract this phenomenon.
What we need is a function that takes our composed type MOut Int and a
function to produce, from that, a new MOut Int, concatenating the
output of the computation of the first with the output of the
computation of the second.
This is what bindM does:

> bindM :: MOut a -> (a -> MOut b) -> MOut b
> bindM m f = (b, x ++ y)
>             where (a, x) = m
>                   (b, y) = f a

It takes:
1. "m": the compound type MOut Int carrying the result of an "eval
Term",
2. a function "f". This function will take the Int extracted by the
evaluation of "m" (see the line above). This function will produce a
new a new pair of: a new Int produced by a new evaluation; some new
output.
bindM will return the new Int in pair with the concatenated outputs
resulting from the evaluation of "m" and "f a".

So let's write the new version of the evaluator:

> evalM_1 :: Term -> MOut Int
> evalM_1 (Con a) = (a, formatLine (Con a) a)
> evalM_1 (Add t u) = bindM (evalM_1 t) (\a -> 
>                                      bindM (evalM_1 u) (\b -> 
>                                                         ((a + b), formatLine (Add t u) (a + b))
>                                                     )
>                                     )

Ugly, isn't it?

Let's start from the outside:
bindM (evalM_1 u) (\b -> ((a + b), formatLine (Add t u) (a + b)))

bindM takes the result of the evaluation "evalM_1 u", a type Mout Int,
and a function. It will extract the Int from that type and use it to
bind "b".
So in bindM (evalM_1 u)... "b" will be bound to a value.

Then the outer part (bindM (evalM_1 t) (\a...) will bind "a" to the
value needed to evaluate "((a+b), formatLine...) and produce our final
MOut Int.

We can write the evaluator in a more convinient way, now that we know
what it does:

> evalM_2 :: Term -> MOut Int
> evalM_2 (Con a) = (a, formatLine (Con a) a)
> evalM_2 (Add t u) = evalM_2 t `bindM` \a ->
>                     evalM_2 u `bindM` \b ->
>                     ((a + b), (formatLine (Add t u) (a + b)))

Now, look at the first part:
evalM_2 (Con a) = (a, formatLine (Con a) a)

We could use a more general way of creating some output.

First we need a method for creating someting of type M a, starting from
something of type a. This is what evalM_2 (Con a) is doing, after all.
Very simply:

> mkM :: a -> MOut a
> mkM a = (a, "")

We then need to "insert" some text (Output) in our type M:

> outPut :: Output -> MOut ()
> outPut x = ((), x)

Very simple: we have a string "x" (Output) and create a pair with a ()
instead of an Int, and the output.

This way we will be able to define also this firts part in terms of
bindM, that will take care of concatenating outputs.

So we have now a new evaluator:

> evalM_3 :: Term -> MOut Int
> evalM_3 (Con a) = outPut (formatLine (Con a) a) `bindM` \_ -> mkM a
> evalM_3 (Add t u) = evalM_3 t `bindM` \a ->
>                    evalM_3 u `bindM` \b ->
>                    outPut (formatLine (Add t u) (a + b)) `bindM` \_ -> mkM (a + b)

Well, this is fine, definetly better then before, anyway.
Still we use `bindM` \_ -> that binds something we do not use (_). We
could write something for this case, when we concatenate computations
without the need of binding variables. Let's call it `combineM`:

> combineM :: MOut a -> MOut b -> MOut b
> combineM m f = m `bindM` \_ -> f

So the new evaluator:

> evalM :: Term -> MOut Int
> evalM (Con a) = outPut (formatLine (Con a) a) `combineM` 
>                   mkM a
> evalM (Add t u) = evalM t `bindM` \a ->
>                   evalM u `bindM` \b ->
>                   outPut (formatLine (Add t u) (a + b)) `combineM` 
>                   mkM (a + b)

Let's put everything together (and change some names):

> type MO a = (a, Out)
> type Out = String

> mkMO :: a -> MO a
> mkMO a = (a, "")

> bindMO :: MO a -> (a -> MO b) -> MO b
> bindMO m f = (b, x ++ y)
>              where (a, x) = m
>                    (b, y) = f a

> combineMO :: MO a -> MO b -> MO b
> combineMO m f = m `bindM` \_ -> f

> outMO :: Out -> MO ()
> outMO x = ((), x)

> evalMO :: Term -> MO Int
> evalMO (Con a) = outMO (formatLine (Con a) a) `combineMO`
>                  mkMO a
> evalMO (Add t u) = evalMO t `bindMO` \a ->
>                    evalMO u `bindMO` \b ->
>                    outMO (formatLine (Add t u) (a + b)) `combineMO` 
>                    mkMO (a + b)

That's it. For today...

(TO BE CONTINUED)

Andrea Rossato
arossato AT istitutocolli.org