compilation of pattern-matching?

Duncan Coutts duncan.coutts at worc.ox.ac.uk
Wed Mar 25 22:51:49 EDT 2009


On Wed, 2009-03-25 at 18:01 +0000, Claus Reinke wrote:

> With the hint about branch prediction, I also found this old ticket 
> (which seems to have been waiting for the new backend, and
> indicates rather larger performance differences):
> 
>     Offer control  over branch prediction
>     http://hackage.haskell.org/trac/ghc/ticket/849

Oh, I should have read the whole thread first. I see you found it
already.

Yes, I did find that, in the ByteString stream/unstream functions,
essentially arbitrary changes in the logic of the branches changed
performance by a factor of two or so. At the time I put it down to basic
block ordering and branch prediction.

A related issue is that we cannot reliably force a call to be
out-of-line (so that it doesn't add code to the hot path) without also
disabling the worker/wrapper transform.
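
For instance, the workaround tends to look something like this (a sketch
with made-up names, not code from ByteString):

{-# LANGUAGE BangPatterns #-}

-- slowPath stands in for the rarely-taken branch we want out of the hot
-- path. Marking it NOINLINE keeps the call out-of-line, but NOINLINE also
-- stops GHC applying the worker/wrapper transform to slowPath, so its
-- strict argument stays boxed.
{-# NOINLINE slowPath #-}
slowPath :: Int -> Int
slowPath !n = n * n + 1

hotLoop :: Int -> Int -> Int
hotLoop !acc 0 = acc
hotLoop !acc n
  | n `rem` 1000 == 0 = hotLoop (acc + slowPath n) (n - 1)  -- rare case
  | otherwise         = hotLoop (acc + 1) (n - 1)           -- common case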

> But UNLIKELY only covers the most common case
> (marking alternatives with minimal expected frequency) -
> if clause ordering was respected, relative frequencies of
> alternatives could be specified without pragmas, just by
> ordering pattern-match or conditional clauses according
> to expected frequency.
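
To make the suggestion concrete, I read it as something like the
following, where clauses are written most-likely first and the compiler
would be expected to lay the branches out in that source order (a sketch
of the idea only; GHC makes no such promise today):

classify :: Int -> String
classify n
  | n > 0     = "positive"   -- expected common case, listed first
  | n == 0    = "zero"
  | otherwise = "negative"   -- expected rare case, listed last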

I think marking expectations (either manually or by profile-directed
feedback) is a more profitable approach. We can end up with nested cases
part way through optimisation that were never there explicitly in the
source, so preserving source order is meaningless there.

For example consider a simple tail recursive loop:

{-# LANGUAGE BangPatterns #-}
import Prelude hiding (length, null, tail)
import Data.ByteString.Lazy (ByteString, null, tail)

length :: ByteString -> Int
length = go 0
  where
    go !n bs  | null bs    = n
              | otherwise  = go (n+1) (tail bs)

There are no explicit cases here. There is nothing for the programmer to
order. In any case, this is a user-defined function and they don't know
the details of how to optimise this.

After inlining we actually find that there are three cases (sketched
below):

      * the lazy ByteString being Empty
      * it being a non-empty Chunk with 1 byte left in it
      * it being a non-empty Chunk with more than 1 byte left
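
Spelled out, the shape after inlining is roughly this (a sketch only; the
Core GHC actually produces will differ in detail):

{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString as S
import Data.ByteString.Lazy.Internal (ByteString (Empty, Chunk))

lengthInlined :: ByteString -> Int
lengthInlined = go 0
  where
    go !n Empty = n                                        -- string exhausted
    go !n (Chunk c cs)
      | S.length c == 1 = go (n + 1) cs                    -- last byte of this chunk
      | otherwise       = go (n + 1) (Chunk (S.tail c) cs) -- more bytes in this chunk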

In similar situations it may be profitable to re-nest the cases. But if
we attach likelihoods then that seems more robust than trying to
maintain orderings on cases. In the above example I'd annotate the
definition of tail to indicate that the chunk being of length 1 is much
less likely than it having more bytes left.
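
Something along these lines, where the annotation is entirely made up
(ticket #849 is only a feature request; GHC has no such hint today):

import Prelude hiding (tail)
import qualified Data.ByteString as S
import Data.ByteString.Lazy.Internal (ByteString (Empty, Chunk))

-- "unlikely" is purely hypothetical: an identity function standing in for
-- the kind of branch-likelihood hint ticket #849 asks for.
unlikely :: a -> a
unlikely x = x

tail :: ByteString -> ByteString
tail Empty        = error "tail: empty ByteString"
tail (Chunk c cs)
  | S.length c == 1 = unlikely cs            -- rare: this was the chunk's last byte
  | otherwise       = Chunk (S.tail c) cs    -- common: stay within the current chunk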

Actually in this example the information probably doesn't need to be
given explicitly at all since one branch leads to a recursive call and
the other to a function return. A static model here would be enough, no
hints or profile feedback required.

Duncan


