compilation of pattern-matching?

Lennart Augustsson lennart at
Wed Mar 25 14:41:24 EDT 2009

When you tried switching Nil and Cons, did you try it on many examples?
For a single example a 2-3% could be easily attributed to random
effects like different instruction cache hit patterns.  If you get it
consistently over several programs then it seems likely to mean
something, but I'm not sure what.

On Wed, Mar 25, 2009 at 6:01 PM, Claus Reinke <claus.reinke at> wrote:
>> Indeed GHC does not attempt to retain the order of alternatives, although
>> a) it might be possible to do so by paying more attention in numerous
>> places
>> b) GHC may do so already, by accident, in certain cases
> That adds even more unpredictability. One thing that I don't want whenever I
> have to care about performance is small changes
> having odd/unexplainable effects (I vaguely recall a case where removing an
> unused parameter from a recursion made the program
> slower, or eliminating returned constructors by using continuations
> made one inner-loop function faster, another slower..).
> Lennart is of course right: even if GHC would respect the ordering indicated
> in my source program, I might not be able to tune that source to make good
> use of even a single target processor (I tried defining a foldl over a
> user-defined List type, so that I could switch
> the order of Nil/Cons, and hence the test order used by GHC, and found the
> Nil-before-Cons order to be 2-3% faster for folding a
> very long list than the Cons-before-Nil order I wanted), but it is very
> frustrating if I'm not even given the chance because GHC
> sorts the alternatives, not even according to its own interpretation
> of branching performance, but completely arbitrarily!-)
>> * The issue at stake is a small one: not the *number of tests* but *which
>> tests branch, and which fall through*.
> Right on the issue, but I'm not quite sure how small it is: the test
> case source I attached a few messages ago consistently showed one ordering
> to be 5% faster than the other for the extreme case
> of one test nearly always failing. There may well be more profitable
> optimizations remaining to be implemented first - what disturbs me is that
> Haskell code is full of conditionals and matches, which I tend to arrange
> according to expected frequency, and GHC simply ignores all those hints.
> With the hint about branch prediction, I also found this old ticket (which
> seems to have been waiting for the new backend, and
> indicates rather larger performance differences):
>   Offer control  over branch prediction
>> * Simply ordering the equations doesn't really work, because pattern-match
>> compilation will match an entire column at once:
>>  f (x:xs) True = ...
>>  f []     True = ...
>>  f []     False = ...
>>  f (x:xs) False = ...
>> Which "order" should the (:)/[] test go in?
> In the order indicated in the source!? The usual pattern-match
> optimizations should not change that, they will just skip the two
> '[]' cases if the list isn't empty, or use the constructor tag to jump
> directly to a sub-column. Haskell specifies left-to-right, top-down.
>> * Not only does GHC currently not attempt to retain order, but for a
>> particular order it makes no guarantees about which falls through.  For
>> example, given
>>       case ... of { A -> e1; C -> e2; B -> e3 }
>> We might test for A and then
>> either fall though to e1
>> or     fall through to the test for C
> That is the part I missed, and which might give the UNLIKELY
> pragma, as suggested in #849, more expressive power than
> plain clause ordering. However, since Haskell specifies a match
> order, I don't see why that couldn't be used as the basis for
> mapping to branches as well, with the clauses listed in decreasing
> likelyhood, and GHC generating the branch predictions and fallthroughs to
> match this information to the target processor characteristics?
>> * When the number of constructors is larger, instead of a linear sequence
>> of tests, GHC may generate a table-jump; or a balanced tree of tests.
> The table-jump would make all alternatives equally costly/fast,
> with no penalty for adding infrequent alternatives, right? The
> balanced tree sounds like one of the pattern-match state machines, and there
> would still be room for representing expected frequency in terms of
> tree-path/-rotation/-representation.
>> * Which plan performs best is tremendously architecture dependent, and may
>> well vary a lot between different chips implementing the same instruction
>> set.  It's a losing battle to fix the strategy in source code.
>> * More promising might be to say "this is the hot branch".  That
>> information about frequency could in principle be used by the back end to
>> generate better code.  However, I am unsure how
>>       a) to express this info in source code
>>       b) retain it throughout optimisation
> So it should be specified in the source, after all, just in a way
> that gives programmers room to express their knowledge while
> leaving GHC free to implement that knowledge on the target.
> Things like the UNLIKELY pragma would seem useful, if
> attached to decisions: unless GHC can optimize the whole decision away, it
> will remain throughout optimization, and come out as some form of branch,
> with the hint still attached.
> But UNLIKELY only covers the most common case
> (marking alternatives with minimal expected frequency) -
> if clause ordering was respected, relative frequencies of
> alternatives could be specified without pragmas, just by
> ordering pattern-match or conditional clauses according
> to expected frequency.
>> Claus, if you think this thread is worth capturing, then do write a
>> Commentary page, and I'll check its veracity.
> Given the existence of #849, I've just linked this thread
> from there.
> Claus
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users at

More information about the Glasgow-haskell-users mailing list