[Haskell-cafe] About the lazy pattern
Ryan Ingram
ryani.spam at gmail.com
Thu May 28 12:52:18 EDT 2009
The main change I would make is to rename the arguments to
client/server; they overload the same names (reqs/resps) as the top
level declarations above, so it's very easy to get confused while
reading it.
Partially, I think this is just a hard concept to understand;
struggling to figure it out definitely helped me understand the
language in general.
One idea is to draw a graph showing the cycle of dependences: client
-> resps, resps -> server, server -> reqs, reqs -> client, and
explaining that we break the dependency cycle in two ways:
(1) the "init" parameter to client allows "reqs" to output its first
message before any processing has done
(2) but in order to do so, it needs to be able to delay the pattern match.
Perhaps rewrite client first in this way:
client init rsps = init : (rest rsps) where
rest (r:rs) = client (next r) rs
then show the additional syntactic sugar for delaying the pattern
match until needed?
It also might help to make the example concrete by implementing a
simple "next" and "process".
-- ryan
On Thu, May 28, 2009 at 4:18 AM, Petr Pudlak <deb at pudlak.name> wrote:
> Hi Ryan, thanks for a nice and thorough explanation. I had trouble
> understanding the section of the tutorial as well. Maybe it would deserve to
> rewrite to something a bit simpler?
>
> Anyhow, I'd like to ask: Is there a reason for which pattern matching for
> single-constructor data types isn't lazy by default? For example, why do I have
> to write
>
>> weird ~(x,y) = (1,x)
>> crazy = weird crazy
>
> when the compiler knows that the type (x,y) has just the single constructor,
> and could automatically introduce the additional laziness without any risk?
>
> Petr
>
>> Hi nemo. I had a lot of trouble with that section of the tutorial as
>> well, and I believe that once you get it, a lot of Haskell becomes a
>> lot simpler.
>>
>> The way I eventually figured it out is using this idealized execution
>> model for Haskell: you just work by rewriting the left side of a
>> function to its right side. The only question is figuring out *which*
>> function to rewrite!
>>
>> Here's a simpler example:
>>
>> > f 0 = 0
>> > f x = x+1
>>
>> > g (x:xs) = error "urk"
>> > g [] = 2
>> > const a b = a
>>
>> > ex1 = const (f 1) (g [2,3])
>> > ex2 = f (const (g []) (g [1,2]))
>>
>> Lets say you wanted to know the value of ex1; you just use rewriting
>>
>> ex1
>> -- rewrite using ex1
>> = const (f 1) (g [2,3])
>> -- rewrite using const
>> = f 1
>> -- rewrite using f (second pattern)
>> = 1+1
>> -- rewrite using +
>> = 2
>>
>> But lets say we picked a different order to rewrite...
>>
>> ex1
>> -- rewrite using ex1
>> = const (f 1) (g [2,3])
>> -- rewrite using g
>> = const (f 1) (error "urk")
>> -- rewrite using error
>> computation stops
>>
>> Of course this is bad, and it was obviously wrong to evaluate g first
>> (because const is lazy). So one heuristic we always use is "rewrite
>> the leftmost application first" which avoids this problem. But lets
>> try that rule on ex2!
>>
>> ex2
>> -- rewrite using ex2
>> = f (const (g []) (g [1,2]))
>> -- rewrite using f
>> = ?
>>
>> Unfortunately, now we don't know which pattern to match! So we have
>> to pick a different thing to rewrite. The next rule is that, if the
>> thing you want to rewrite has a pattern match, look at the argument
>> for the patterns being matched and rewrite them instead, using the
>> same "leftmost first" rule:
>>
>> f (const (g []) (g [1,2]))
>> -- trying to pattern match f's first argument
>> -- rewrite using const
>> = f (g [])
>> -- still pattern matching
>> -- rewrite using g
>> = f 2
>> -- now we can match
>> -- rewrite using f (second pattern)
>> = 2+1
>> -- rewrite using +
>> = 3
>>
>> So, back to the original question (I'm rewriting the arguments to
>> "client" and "server" for clarity)
>>
>> > reqs = client init resps
>> > resps = server reqs
>> > client v (rsp:rsps) = v : client (next rsp) rsps
>> > server (rq:rqs) = process rq : server rqs
>>
>> Lets say we are trying to figure out the value of "resps", to print
>> all the responses to the screen:
>>
>> resps
>> -- rewrite using resps
>> = server reqs
>> -- pattern match in server
>> -- rewrite reqs
>> = server (client init resps)
>> -- pattern match in server
>> -- pattern match also in client
>> -- rewrite using resps
>> = server (client init (server reqs))
>> -- pattern match in server, client, then server
>> -- rewrite using reqs
>> = server (client init (server (client init resps)))
>>
>> You see that we are in a loop now; we are stuck trying to pattern
>> match and we will never make any progress!
>>
>> The "lazy pattern" says "trust me, this pattern will match, you can
>> call me on it later if it doesn't!"
>>
>> > reqs = client init resps
>> > resps = server reqs
>> > client v (rsp:rsps) = v : client (next rsp) rsps
>> > server (rq:rqs) = process rq : server rqs
>>
>> resps
>> -- rewrite resps
>> = server reqs
>> -- server wants to pattern match, rewrite reqs
>> = server (client init resps)
>> -- Now, the lazy pattern match says "this will match, wait until later"
>> -- rewrite using client!
>> = let (rsp:rsps) = resps
>> in server (init : client (next rq) rqs)
>> -- rewrite using server
>> = let (rsp:rsps) = resps
>> in process init : server (client (next rq) rqs)
>>
>> We now have a list node, so we can print the first element and
>> continue (which requires us to know the code for "process" and "next",
>> but you get the idea, I hope!)
>>
>> Now of course, you can lie to the pattern matcher:
>>
>> > next x = x + 1
>> > init = 5
>>
>> client init []
>> -- rewrite using client
>> = let (rsp0:rsps0) = []
>> in init : client (next rsp0) rsps0
>> -- rewrite using init
>> = let (rsp0:rsps0) = []
>> in 5 : client (next rsp0) rsps0
>>
>> -- print 5 and continue to evaluate...
>> let (rsp0:rsps0) = []
>> in client (next rsp0) rsps0
>> -- rewrite using client
>> = let
>> (rsp0:rsps0) = []
>> (rsp1:rsps1) = rsps0
>> in (next rsp0) : client (next rsp1) rsps1
>> -- rewrite using next
>> = let
>> (rsp0:rsps0) = []
>> (rsp1:rsps1) = rsps0
>> in rsp0+1 : client (next rsp1) rsps1
>> -- + wants to "pattern match" on its first argument
>> -- rewrite using rsp0
>> computation stops, pattern match failure, (rsp0:rsps0) does not match []
>>
>> For this reason, many people (myself included) consider it bad style
>> to use lazy/irrefutable pattern matches on data types with more than
>> one constructure. But they are very handy on types with a single
>> constructor, like pairs:
>>
>> > weird ~(x,y) = (1,x)
>> > crazy = weird crazy
>>
>> crazy
>> -- rewrite crazy
>> = weird crazy
>> -- rewrite weird
>> = let (x0,y0) = crazy
>> in (1, x0)
>> -- We really want to know what x0 is now.
>> -- But it is the result of a pattern match inside a let; so we need to evaluate
>> -- the right hand side of the binding to see if the patterns match.
>> -- So, rewrite crazy to attempt to pattern match...
>> = let (x0, y0) = weird crazy
>> in (1, x0)
>> -- Then, rewrite weird
>> = let
>> (x1, y1) = crazy
>> (x0, y0) = (1, x1)
>> in (1, x0)
>> -- rewrite x0
>> = let
>> (x1, y1) = crazy
>> (x0, y0) = (1, x1)
>> in (1, 1)
>> -- garbage collect
>> = (1,1)
>>
>> I hope this helps!
>>
>> -- ryan
>
More information about the Haskell-Cafe
mailing list