Bang patterns, ~ patterns, and lazy let

Wed Feb 8 00:56:05 EST 2006

>  
>
From: Simon Peyton-Jones
To: John Hughes ; haskell-prime at haskell.org
Sent: Tuesday, February 07, 2006 11:37 PM
Subject: RE: Bang patterns, ~ patterns, and lazy let

Applying the rules on the wiki, the first step is to translate the first 
expression into a tuple binding, omitting the implicit ~:

Not so! I changed it a few days ago after talking to Ben, to a simpler 
form that works nicely for recursive bindings too. Darn I forgot to 
change the rules at the bottom.

Anyway, read the section “Let and where bindings”. Sorry about the rules 
at the end.

Simon

The trouble with those parts is that NOWHERE do they discuss how to 
translate a let or where containing more than one binding. If they're 
not to be translated via tupling, then how are they to be translated? 
The only relevant thing I could find was in the "modifications to the 
report" section at the bottom, which just tells you to omit implicit ~ 
when applying the tuplling rules in the report.

So I don't understand how the semantics of multiple bindings is supposed 
to be defined (and I admit my proposal isn't so nice either). But more 
and more complex translations make me very nervous!

I have a feeling there could be a nice direct semantics, though, 
including both ! and ~ in a natural way. Let's see now.

Let environments be (unlifted) functions from identifiers to values, 
mapping unbound identifiers to _|_ for simplicity. The semantics of 
patterns is given by

P[[pat]] :: Value -> Maybe Env

The result is Just env if matching succeeds, Nothing if matching fails, 
and _|_ if matching loops.

Two important clauses:

P[[! pat]] v = _|_ if v=_|_

P[[pat]]v otherwise

P[[~ pat]] v = Just _|_ if P[[pat]]v <= Nothing

P[[pat]]v otherwise

In definitions, pattern matching failure is the same as looping, so we 
define

P'[[pat]]v = _|_ if P[[pat]]v = Nothing

P[[pat]]v otherwise

We do need to distinguish, though, between _|_ (match failure or 
looping), and Just _|_ (success, binding all variables to _|_).

The semantics of a definition in an environment is

D[[pat = exp]]env = P'[[pat]] (E[[exp]]env) (*)

where E is the semantics of expressions. Note that this takes care of 
both ! and ~ on the top level of a pattern.

Multiple definitions are interpreted by

D[[d1 ; d2]]env = D[[d1]]env (+) D[[d2]]env

where (+) is defined by

_|_ (+) _ = _|_

Just env (+) _|_ = _|_

Just env (+) Just env' = Just (env |_| env')

Note that (+) is associative and commutative.

Let's introduce an explicit marker for recursive declarations:

D[[rec defs]]env = fix menv'. D[[defs]](env |_| fromJust menv')

Notice:

This ignores the possibility of local variables shadowing variables from 
outer scopes.

*Within defs* it makes no difference whether menv' is _|_ (matching 
fails or loops), or Just _|_ (succeeds with variables bound to _|_)

If defs are not actually recursive, then D[[rec defs]]env = D[[defs]]env.

Now let expressions are defined by

E[[let defs in exp]]env = E[[exp]](env |_| D[[rec defs]]env)

(this also ignores the possibility of local definitions shadowing 
variables from an outer scope).

Too late at night to do it now, but I have the feeling that it should 
not be hard now to prove that

E[[let defs1 in let defs2 in exp]]env = E[[let defs1; defs2 in exp]]env

under suitable assumptions on free variable occurrences. That implies, 
together with commutativity and associativity of (+), that the division 
of declaration groups into strongly connected components does not affect 
semantics.

I like this way of giving semantics--at least I know what it means! But 
it does demand, really, that matching in declarations is strict by 
default. Otherwise I suppose, if one doesn't care about 
compositionality, one could replace definition (*) above by

D[[!pat = exp]]env = P'[[pat]](E[[exp]]env)

D[[pat = exp]]env = P'[[~pat]](E[[exp]]env), otherwise

But this really sucks big-time, doesn't it?

John