[Haskell-cafe] Optimizing cellular automata & the beauty of
s.clover at gmail.com
Thu Dec 20 22:42:35 EST 2007
I'm curious how much of the unboxing helped performance and how much
didn't. In my experience playing with this stuff, GHC's strictness
analyzer has consistently been really excellent, given the right
hints. Unboxed tuples are generally a win, but GHC was often smarter
at unboxing ints than I was. Also, higher-order functions can be used
fine, assuming they're not awful recursive, as long as you set GHC's
unfolding threshold high enough. You can also manually force their
inlining, to really get control. I found there tended to be a "sweet
spot" for any given program, where much higher or lower would degrade
performance. As far as removing the word2int# and soforth, you could
probably just use words throughout?
Also, as we discovered the other week, you may want to be careful
with bang patterns, as forcing strictness on already evaluated values
takes some time. Although, SPJ did point out that this was
significantly improved in the new GHC.
But yes, I found that going through and manually unboxing a working
implementation with the help of type errors was actually a sort of
relaxing and zenlike exercise.
On Dec 20, 2007, at 6:07 PM, Justin Bailey wrote:
> I'm back with another version of my cellular automata simulator.
> Since the last iteration I discovered GHC's unlifted types and the
> primitive operations that go with them. Using these types, rather
> than Ints, sped my code up by 2x or more.
> http://hpaste.org/4151#a2 -- first half of program
> http://hpaste.org/4151#a3 -- remaining (note 3 lines or so from
> first half are repeated)
> The key observation came from looking at the Core files, which
> showed a lot of int2word# and word2int# conversions going on.
> Figuring out how to remove those led me to the unlifted types.
> Coding with these types is not easy (for example, I couldn't see a
> way to write a Word# value directly - I had to write stuff like
> "int2Word# 1#"), but because I had an existing algorithm to guide
> me, combined with type checking, it was a fairly straightforward
> At first I felt kind of bad about using these operations, but then
> I noticed they are used pretty heavily by GHC itself. If it's good
> enough for GHC, it's good enough for me. The 2x performance gain
> didn't hurt either. Finally, the safety that comes from using the
> ST monad is just awesome. I've got low-level bit munging combined
> with high-level abstraction where I need it. So cool!
> I was disappointed to find early on that using higher-order
> functions in tight loops was a performance killer. It's unfortunate
> because I started with a very elegant, short implementation based
> on a simple Ring buffer and map. The current version is certainly
> harder to understand and has some weird limitations. However,
> having the simple implementation let me use quickcheck to compare
> their results on random rules and inputs, which gave me high
> confidence that my complex implemenation is correct.
> One thing I absolutely love about this program is its memory
> performance. It manages to stay within 1 - 10 MB of memory,
> depending on how much output is produced. How cool is that?
> On Dec 3, 2007 2:44 AM, Mirko Rahn <rahn at ira.uka.de > wrote:
> It is interesting, that the naive implementation
> is only 3 times slower than your quite complex, hard to follow and
> to debug implementation.
> Now the naive implementation is 100x slower, so I don't feel so bad
> about this comment any more.
> As always, I prefer to write most code in Haskell, quick, easy, nice,
> reasonable fast, ... If speed matters, I switch to some lower level
> language, as you did staying inside Haskell.
> I have to take exception here - I *want* to write my code in
> Haskell. If Haskell isn't fast enough to beat the C implementation,
> I'd rather find a way to make my Haskell program faster than switch
> to some other language.
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
More information about the Haskell-Cafe