[Haskell-cafe] Re: implementing recursive let

Sat Nov 28 18:12:01 EST 2009

Hi Ryan,

first, to get this out of the way, you wrote:

> Also, your definition of "Function" seems to have problems with
> scoping; unless you intended to make a dynamically scoped language,

No, absolutely not! In fact, the whole exercise has been born out of
frustration with certain ad-hoc extensions to an already evil
domain-specific (macro substitution) language -- the extension being to add
dynamically scoped local variables; and the basic evilness to allow
substitution to occur in variable names (similar to make) as a poor man's
substitute for functional abstraction. This makes for extremely cryptic
programs whose result is very hard to predict. My aim is to show that there
is a better way.

> (Value -> Eval Value) seems very likely to get evaluated in the
> context it is called in.

Fortunately, this is not the case, as I explicitly capture the evironment at
the definition site, ignoring the one at the call site:

eval (Lam parm body) = do
  env <- ask
  return $ Function (\val -> local (\_ -> M.insert parm val env) (eval
body))

Now to the interesting part:

> Now the question is, what do you want to happen when given a malformed
> let expression?  I am pretty sure that you need more complicated
> flow-control here in order to get the result.  I believe you are on
> the right track with continuations.

My problem is that I have never really become comfortable with
continuations; just couldn't wrap my head around all the nested lambdas
involved. Is there a nice tutorial (preferably one of those functional
perls, I love them) that explains how CPS actually works to produce those
wonderful effects, like jumping around, fixing evaluation order and
whatnot? I tried to follow the recent explanations by Jacques Carette and
Oleg Kiselyov on this list but I must admit that I understood nought.

> Here is a question; what should these expressions do?
>> let y = x; x = 1 in y
>> let y = x x; x = 1 in x
>> let x = x in x
> 
> The last one is quite telling; I can see three possible behaviors here:
> 
> 1) Loop
> 2) return some simple undefined value
> 3) Give an error "blackhole"
> 
> I will note that behavior (1) seems very difficult to achieve with
> your current monad stack; eval (Var x) terminates simply by looking up
> the value in the environment.
> 
> I think you need to think hard about evaluation order and decide what
> you really want to happen.  The simplest answer, if you want to stay
> with strict evaluation, is probably to only allow recursive *function*
> definitions.  This way you can delay fully initializing the
> environment until after you've finished evaluating the functions
> themselves.

Thanks, Ryan. This got me thinking about the right questions. I found out
that what I really want is a mixture of lazy and strict evaluation: I want
variable definitions in a let expression to be lazy, but application of
functions to be strict. (I don't know whether this kind of mixture has been
used before.) Thus

>> let y = x; x = 1 in y

should evaluate to  1 . I want the meaning of declarations on the same level
to be independent of their relative order. This is a purely functional
language, after all, so why should it matter in which order things are
defined?

>> let y = x x; x = 1 in x

Here  y  is never used, so again this evaluates to  1 .

>> let x = x in x

This should loop (or maybe better detected as a failure i.e. backhole), but
only if and when x is used, either in an application or as the final result
of the program. (In the former case it doesn't make a difference whether  x 
is used in function or in argument position.)

  ***********

Thinking about how to make it _explicit_ in my code that application is
strict, whereas variables are lazy, I saw that this needs a change in the
type of environments. It used to be a map from variable names to _values_,
i.e. evaluated expressions. If I change this to a map from variable names
to either thunks (i.e. unevaluated expressions) or (evaluated) values, then
everything else falls smoothly into place; no need for mdo/mfix anymore,
thus no need for fiddling with ErrorT internals to convince it that
variable lookup always succeeds, and last not least all my examples behave
as I expect them to do (see attached code).

So, in a way I /have/ (finally) given up ;-) because variables are now
(internally) mutable cells: when a variable is demanded (e.g. by an
application) it gets mutated from thunk to value. Could as well revert to a
Reader monad and use STRefs for efficiency. (Or maybe I will finally try to
understand how to use continuations for stuff like this.)

I have learned (at least) this: The problem with using the host language's
lazyness for implementing lazyness in the defined language is that the
former is not directly observable. Thus it works fine as long as you buy
the whole package, i.e. either make sure that there can't be a failure, or
else use not only the built-in evaluation order but also the built-in
failure mode: error, pattern match failure, i.e. exceptions that can occur
in pure code. This has the disadvantage that you can only handle such
failures in the IO monad. Adding an explicit failure mode (as a value, i.e.
Either, ErrorT, whatever) is not really compatible with relying on the
implicit built-in lazyness: it makes the outcome hard to predict and the
whole thing becomes fragile.

This is interesting insofar as normally people stumble over the 'dual'
problem: they find out that exceptions sometimes do not play nice with
lazyness (e.g. exception handlers don't fire because some code gets
evaluated only much later, etc).

Cheers
Ben
-------------- next part --------------
{-# LANGUAGE RecursiveDo, GeneralizedNewtypeDeriving,
TypeSynonymInstances, MultiParamTypeClasses #-}
import Control.Monad
import Control.Monad.State
import Control.Monad.Error
import Control.Monad.Writer
import Control.Monad.Reader
import qualified Data.Map as M

data Expr = Let [(String, Expr)] Expr | Const Int | Var String
          | Lam String Expr | App Expr Expr

data Value = Data String | Function (Value -> Eval Value)

instance Show Value where
    show (Data s) = s

data Definition = Thunk Expr | Value Value

type Env = M.Map String Definition

eval :: Expr -> Eval Value
eval (Const n) = return (Data (show n))
eval (Var x) = do
  env <- get
  case M.lookup x env of
    Just (Thunk expr) -> do
      val <- eval expr
      put (M.insert x (Value val) env)
      return val
    Just (Value val) -> return val
    Nothing -> do
      warning ("reference to undefined variable " ++ show x)
      let val = Data "<undefined>"
      put (M.insert x (Value val) env)
      return val
eval (Let decls body) = do
  let define (name,expr) = (name, Thunk expr)
       -- no evaluation yet, i.e. lazy variable definition
      updateEnv env = foldr (uncurry M.insert) env $ map define decls
  local updateEnv $ eval body
eval (Lam parm body) = do
  env <- ask
  return $ Function (\val -> local (\_ -> M.insert parm (Value val) env) (eval body))
eval (App fun arg) = do
  f <- eval fun
  case f of
    Function f -> do
      eval arg >>= f -- call-by-value, i.e. application is strict
    Data s ->
      throwError $ "application of non-function `" ++ s ++ "`"

warning s = tell $ ["Warning: " ++ s]

newtype Eval a = Eval {
    unEval :: ErrorT String (StateT Env (Writer [String])) a
  } deriving (
    Monad, MonadWriter [String],
    MonadState Env, MonadError String
  )

runEval :: Eval Value -> Either String Value
runEval = fst . runWriter . flip evalStateT M.empty . runErrorT . unEval

evaluate = runEval . eval

instance MonadReader Env Eval where
  ask = get
  local tr act = do
    s <- get
    modify tr
    r <- act
    put s
    return r

-- examples

good1 = Let [("x", Const 1)] (Var "x")
-- Right 1
good2 = Let [("y", Var "x"),("x", Const 1)] (Var "y")
-- Right 1
undef1 = Let [("x", Const 1)] (Var "y")
-- Right <undefined>
letf = Let [("f",Lam "x" (Var "x"))] (App (Var "f") (Const 1))
-- Right 1
badapp = Let [("f",Lam "x" (Var "x"))] (App (Const 1) (Var "f"))
-- Left "application of non-function `1`"
undef2 = Let [("x", Var "x")] (Var "y")
-- Right <undefined>
bottom1 = Let [("x", Var "x")] (Var "x")
-- loops in ghci