[Haskell-cafe] About the lazy pattern

Thu May 28 07:18:01 EDT 2009

Hi Ryan, thanks for a nice and thorough explanation. I had trouble
understanding the section of the tutorial as well. Maybe it would deserve to
rewrite to something a bit simpler?

Anyhow, I'd like to ask: Is there a reason for which pattern matching for
single-constructor data types isn't lazy by default? For example, why do I have
to write

> weird ~(x,y) = (1,x)
> crazy = weird crazy

when the compiler knows that the type (x,y) has just the single constructor,
and could automatically introduce the additional laziness without any risk?

    Petr

> Hi nemo.  I had a lot of trouble with that section of the tutorial as
> well, and I believe that once you get it, a lot of Haskell becomes a
> lot simpler.
> 
> The way I eventually figured it out is using this idealized execution
> model for Haskell: you just work by rewriting the left side of a
> function to its right side.  The only question is figuring out *which*
> function to rewrite!
> 
> Here's a simpler example:
> 
> > f 0 = 0
> > f x = x+1
> 
> > g (x:xs) = error "urk"
> > g [] = 2
> > const a b = a
> 
> > ex1 = const (f 1) (g [2,3])
> > ex2 = f (const (g []) (g [1,2]))
> 
> Lets say you wanted to know the value of ex1; you just use rewriting
> 
> ex1
> -- rewrite using ex1
> = const (f 1) (g [2,3])
> -- rewrite using const
> = f 1
> -- rewrite using f (second pattern)
> = 1+1
> -- rewrite using +
> = 2
> 
> But lets say we picked a different order to rewrite...
> 
> ex1
> -- rewrite using ex1
> = const (f 1) (g [2,3])
> -- rewrite using g
> = const (f 1) (error "urk")
> -- rewrite using error
> computation stops
> 
> Of course this is bad, and it was obviously wrong to evaluate g first
> (because const is lazy).  So one heuristic we always use is "rewrite
> the leftmost application first" which avoids this problem.  But lets
> try that rule on ex2!
> 
> ex2
> -- rewrite using ex2
> = f (const (g []) (g [1,2]))
> -- rewrite using f
> = ?
> 
> Unfortunately, now we don't know which pattern to match!  So we have
> to pick a different thing to rewrite.  The next rule is that, if the
> thing you want to rewrite has a pattern match, look at the argument
> for the patterns being matched and rewrite them instead, using the
> same "leftmost first" rule:
> 
> f (const (g []) (g [1,2]))
> -- trying to pattern match f's first argument
> -- rewrite using const
> = f (g [])
> -- still pattern matching
> -- rewrite using g
> = f 2
> -- now we can match
> -- rewrite using f (second pattern)
> = 2+1
> -- rewrite using +
> = 3
> 
> So, back to the original question (I'm rewriting the arguments to
> "client" and "server" for clarity)
> 
> > reqs = client init resps
> > resps = server reqs
> > client v (rsp:rsps) = v : client (next rsp) rsps
> > server (rq:rqs) = process rq : server rqs
> 
> Lets say we are trying to figure out the value of "resps", to print
> all the responses to the screen:
> 
> resps
> -- rewrite using resps
> = server reqs
> -- pattern match in server
> -- rewrite reqs
> = server (client init resps)
> -- pattern match in server
> -- pattern match also in client
> -- rewrite using resps
> = server (client init (server reqs))
> -- pattern match in server, client, then server
> -- rewrite using reqs
> = server (client init (server (client init resps)))
> 
> You see that we are in a loop now; we are stuck trying to pattern
> match and we will never make any progress!
> 
> The "lazy pattern" says "trust me, this pattern will match, you can
> call me on it later if it doesn't!"
> 
> > reqs = client init resps
> > resps = server reqs
> > client v (rsp:rsps) = v : client (next rsp) rsps
> > server (rq:rqs) = process rq : server rqs
> 
> resps
> -- rewrite resps
> = server reqs
> -- server wants to pattern match, rewrite reqs
> = server (client init resps)
> -- Now, the lazy pattern match says "this will match, wait until later"
> -- rewrite using client!
> = let (rsp:rsps) = resps
>    in server (init : client (next rq) rqs)
> -- rewrite using server
> = let (rsp:rsps) = resps
>    in process init : server (client (next rq) rqs)
> 
> We now have a list node, so we can print the first element and
> continue (which requires us to know the code for "process" and "next",
> but you get the idea, I hope!)
> 
> Now of course, you can lie to the pattern matcher:
> 
> > next x = x + 1
> > init = 5
> 
> client init []
> -- rewrite using client
> = let (rsp0:rsps0) = []
>    in init : client (next rsp0) rsps0
> -- rewrite using init
> = let (rsp0:rsps0) = []
>    in 5 : client (next rsp0) rsps0
> 
> -- print 5 and continue to evaluate...
>    let (rsp0:rsps0) = []
>    in client (next rsp0) rsps0
> -- rewrite using client
> = let
>     (rsp0:rsps0) = []
>     (rsp1:rsps1) = rsps0
>    in (next rsp0) : client (next rsp1) rsps1
> -- rewrite using next
> = let
>     (rsp0:rsps0) = []
>     (rsp1:rsps1) = rsps0
>    in rsp0+1 : client (next rsp1) rsps1
> -- + wants to "pattern match" on its first argument
> -- rewrite using rsp0
> computation stops, pattern match failure, (rsp0:rsps0) does not match []
> 
> For this reason, many people (myself included) consider it bad style
> to use lazy/irrefutable pattern matches on data types with more than
> one constructure.  But they are very handy on types with a single
> constructor, like pairs:
> 
> > weird ~(x,y) = (1,x)
> > crazy = weird crazy
> 
> crazy
> -- rewrite crazy
> = weird crazy
> -- rewrite weird
> = let (x0,y0) = crazy
>    in (1, x0)
> -- We really want to know what x0 is now.
> -- But it is the result of a pattern match inside a let; so we need to evaluate
> -- the right hand side of the binding to see if the patterns match.
> -- So, rewrite crazy to attempt to pattern match...
> = let (x0, y0) = weird crazy
>    in (1, x0)
> -- Then, rewrite weird
> = let
>     (x1, y1) = crazy
>     (x0, y0) = (1, x1)
>    in (1, x0)
> -- rewrite x0
> = let
>     (x1, y1) = crazy
>     (x0, y0) = (1, x1)
>    in (1, 1)
> -- garbage collect
> = (1,1)
> 
> I hope this helps!
> 
>   -- ryan