[Haskell-cafe] Optimizing cellular automata & the beauty of unlifted types

Thu Dec 20 18:07:30 EST 2007

I'm back with another version of my cellular automata simulator. Since the
last iteration I discovered GHC's unlifted types and the primitive
operations that go with them. Using these types, rather than Ints, sped my
code up by 2x or more.

http://hpaste.org/4151#a2 -- first half of program
http://hpaste.org/4151#a3 -- remaining (note 3 lines or so from first half
are repeated)

The key observation came from looking at the Core files, which showed a lot
of int2word# and word2int# conversions going on. Figuring out how to remove
those led me to the unlifted types. Coding with these types is not easy (for
example, I couldn't see a way to write a Word# value directly - I had to
write stuff like "int2Word# 1#"), but because I had an existing algorithm to
guide me, combined with type checking, it was a fairly straightforward
implementation.

At first I felt kind of bad about using these operations, but then I noticed
they are used pretty heavily by GHC itself. If it's good enough for GHC,
it's good enough for me. The 2x performance gain didn't hurt either.
Finally, the safety that comes from using the ST monad is just awesome. I've
got low-level bit munging combined with high-level abstraction where I need
it. So cool!

I was disappointed to find early on that using higher-order functions in
tight loops was a performance killer. It's unfortunate because I started
with a very elegant, short implementation based on a simple Ring buffer and
map. The current version is certainly harder to understand and has some
weird limitations. However, having the simple implementation let me use
quickcheck to compare their results on random rules and inputs, which gave
me high confidence that my complex implemenation is correct.

One thing I absolutely love about this program is its memory performance. It
manages to stay within 1 - 10 MB of memory, depending on how much output is
produced. How cool is that?

On Dec 3, 2007 2:44 AM, Mirko Rahn <rahn at ira.uka.de > wrote:

> It is interesting, that the naive implementation
>

...

is only 3 times slower than your quite complex, hard to follow and hard
> to debug implementation.
>

Now the naive implementation is 100x slower, so I don't feel so bad about
this comment any more.

>
> As always, I prefer to write most code in Haskell, quick, easy, nice,
> reasonable fast, ... If speed matters, I switch to some lower level
> language, as you did staying inside Haskell.
>

I have to take exception here - I *want* to write my code in Haskell. If
Haskell isn't fast enough to beat the C implementation, I'd rather find a
way to make my Haskell program faster than switch to some other language.

Justin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20071220/3961d67b/attachment.htm