[Haskell-cafe] Re: Field names

Tue Sep 9 23:53:19 EDT 2008

Justin Bailey wrote:
> 2008/9/8 Daryoush Mehrtash <dmehrtash at gmail.com>
> > Thanks.
> >
> > Pattern matching and memory management in Haskell (or may be GHC
> > implementation of it) is somewhat of a mystery to me.  Are there
> > any references that explains the underlying implementation?
> 
> Be careful what you ask for. This paper is 16 years old but fairly
> relevant. Click the "view or download" link at the bottom:
> 
>   "Implementing Lazy Functional Languages on Stock Hardware: The
> Spineless Tagless G-Machine"
>   http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.53.3729

That's an excellent paper for getting down to the gritty details of 
what's going on under the covers. However, I think it's not clear that 
that's what you're really looking for; you needn't know anything about 
the STG in order to know how pattern matching works enough to use it.

In short, a pattern is a (free) variable, or a data constructor applied 
to patterns. So if we have:

  > data MyList = Nil | Cons Int MyList

Then we can have the patterns: Nil, (Cons x xs), (Cons 0 xs),..., (Cons 
x Nil), (Cons 0 Nil),..., (Cons x (Cons x2 xs)), etc.

Other notes:

* As demonstrated above, numeric literals count as "data constructors".

* Since a data constructor can be an infix operator (either spelled with 
backticks or a symbolic name beginning with ':' ) we can also write our 
patterns with infix notation.

* Even though there is an intentional homoiconicity between patterns and 
an expression of data constructors, you can't use arbitrary expressions. 
  This falls out from only allowing data constructors in patterns, 
rather than any arbitrary function.

** In particular, you can't use partial application. You also can't use 
anything like (.), ($), flip,...

** (While n+k patterns exist for legacy reasons, they are an 
abomination. They should not be used and are slated for removal in 
haskell prime.)

** For record syntax this homoiconicity means that you get Foo{xpart=x} 
as a pattern binding the variable x. This follows because that's what 
the expression would look like to construct a Foo setting the xpart to a 
variable x. Perhaps confusingly, the '=' involved here is the one from 
record syntax, not the one from let bindings.

The homoiconicity generally makes code easier to read, though it can be 
somewhat confusing when discussing theoretical concerns. The reason is 
that a single lexeme, e.g. 'Cons', is being used both as a data 
constructor (in expressions) and as a data *de*structor (in patterns). 
Identically, a field name in a record is used both as an injector and as 
a projector. This conceptual overloading is perfectly valid, but it 
sometimes leads to people conflating the ideas which is invalid.

There are sometimes reasons to want to throw a wrench into the works, 
breaking up the homoiconicity. One particular example (which I believe 
will be available in 6.10 though it's not approved for haskell prime) is 
to allow "view patterns". The idea behind view patterns is to allow 
functions to be called behind the scenes in order to convert the 
in-memory representation into a view type, and then do pattern matching 
on that view of the value rather than on the value itself. There are two 
primary uses of this: (1) improving legibility of pattern matching for 
complex datastructures, (2) allowing multiple types to all be pattern 
matched interchangeably, e.g. association lists, Maps, HashMaps,...

There are drawbacks to views (and anything else that breaks 
homoiconicity). First off is that it greatly complicates the story of 
what's going on during pattern matching. More importantly, however, is 
that it means that pattern matches are no longer in correspondence with 
the in-memory representations of values. This means that there is a 
hidden performance cost which can get quite high for deep patterns.

-- 
Live well,
~wren