What I learned from my first serious attempt at low-level Haskell programming

Simon Peyton-Jones simonpj at microsoft.com
Thu Apr 5 03:15:07 EDT 2007

| 5. State# threads clog the optimizer quite effectively.  Replacing
|    st(n-1)# with realWorld# everywhere I could count on data
|    dependencies to do the same job doubled performance.

The idea is that the optimiser should let you write at a high level and do the bookkeeping for you.  When it doesn't, I like to know, and preferably fix it.

If you have a moment to boil this down to a small, reproducible example of this kind of optimisation failure (with as few dependencies as possible), I'll look to see whether the optimiser can be made cleverer.
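
For readers unfamiliar with the style under discussion, here is a minimal sketch of explicit State# threading, where each primitive operation consumes one state token and produces the next. It is not the code from the original report: the names fill, sumSlots and demo are hypothetical, and the 8-bytes-per-slot allocation assumes a 64-bit Int (it merely over-allocates on a 32-bit platform). It deliberately shows only the threaded form, not the realWorld# substitution described in the quoted text.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Base (IO (IO))
import GHC.Exts
  ( Int (I#), Int#, MutableByteArray#, RealWorld, State#
  , isTrue#, newByteArray#, readIntArray#, writeIntArray#
  , (*#), (+#), (>=#) )

-- Write 2*i into slot i for i in [0, n), passing the state token
-- from each write to the next so the writes stay ordered.
fill :: MutableByteArray# RealWorld -> Int# -> State# RealWorld -> State# RealWorld
fill arr n = go 0#
  where
    go i s
      | isTrue# (i >=# n) = s
      | otherwise         = go (i +# 1#) (writeIntArray# arr i (i *# 2#) s)

-- Sum slots [0, n), threading the token through every read.
sumSlots :: MutableByteArray# RealWorld -> Int# -> State# RealWorld
         -> (# State# RealWorld, Int# #)
sumSlots arr n = go 0# 0#
  where
    go i acc s
      | isTrue# (i >=# n) = (# s, acc #)
      | otherwise =
          case readIntArray# arr i s of
            (# s', x #) -> go (i +# 1#) (acc +# x) s'

-- Allocate, fill and sum, wrapped back up as an ordinary IO action.
-- Assumption: 8 bytes per slot (64-bit Int).
demo :: Int -> IO Int
demo (I# n) = IO (\s0 ->
  case newByteArray# (n *# 8#) s0 of
    (# s1, arr #) ->
      case fill arr n s1 of
        s2 ->
          case sumSlots arr n s2 of
            (# s3, total #) -> (# s3, I# total #))

main :: IO ()
main = demo 10 >>= print   -- prints 90
```

A reproduction along these lines, compiled with -O and inspected with -ddump-simpl, would be one way to check whether the token threading itself is what blocks the optimiser.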

| 6. The inliner is a bit too greedy.  Removing the slow-path code from
|    singleton doesn't help because popSingleton is only used once; but
|    if I explicitly {-# NOINLINE popSingleton #-}, the code for
|    singleton itself becomes much smaller, and inlinable (15% perf
|    gain).  Plus the new singleton doesn't allocate memory, so I can
|    use even MORE realWorld#s.

That's a hard one!  Inlining a function that is called just once is usually a huge win. I don't know how to spot what you did in an automated way.
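
As an illustration of the manual technique described in the quoted text, here is a minimal sketch that keeps a rarely-taken slow path out of line so the fast path stays small enough to inline at its call sites. The functions clampByte and clampSlow are hypothetical stand-ins, not singleton and popSingleton from the original code, and whether this helps in any given program has to be measured.

```haskell
-- Fast path: values already in range are returned unchanged.
-- Marked INLINE so call sites get just the range test and a call.
clampByte :: Int -> Int
clampByte n
  | n >= 0 && n <= 255 = n
  | otherwise          = clampSlow n
{-# INLINE clampByte #-}

-- Slow path: called from one place only, so GHC would normally
-- inline it; NOINLINE keeps it as a separate function instead.
clampSlow :: Int -> Int
clampSlow n
  | n < 0     = 0
  | otherwise = 255
{-# NOINLINE clampSlow #-}

main :: IO ()
main = print (map clampByte [-5, 10, 300])   -- prints [0,10,255]
```

The slow path here is far too small to matter in practice; the point is only the shape of the pragmas.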


