[Haskell-cafe] Optimizing cellular automata evaluation (round 2)

Thu Nov 29 19:31:40 EST 2007

I posted awhile back asking for help improving my cellular automata
program. I am competing with a C program which evolves CAs using a
fairly simple genetic algorithm. The algorithm involves evaluating 100
rules on 100 CAs, 100 times. In C this takes about 1 second. In my
Haskell version, it takes about 20 minutes. That's an improvement from
my first attempts, which took 7 hours.

All the time is spent in the function that evaluates one iteration of
a given CA. It examines each cell and its neighbors and, based on the
values found, updates the cell to be white or black. Here is the code,
followed by an explanation of what it's doing. I'm looking for any
suggestions on making this faster. Prettier helps too but not at the
expense of speed.

  http://hpaste.org/4151#a0

I represent the automata as an array of integers, where each bit
represents a cell. Automata 'wrap' around, so marching through the
array and determing the value of all neighbors is tricky around the
edges. My goal was to minimize conditionals as much as possible and to
avoid any "modulo" arithmetic. I wanted to use straightforward array
access.

My implemention uses three functions: fill, fillS and fillE. Each
evalutaion of fill will determine the new values for cells represented
by the current array element. fill handles two conditions - when the
array element being examined is the last element or when it is not. It
calls fillS with appropriate arguments, the results of which become
the value for the integer in the current array position.

fillS then handles two conditions - when the "neighborhood" of the bit
being examined crosses to the "next" array element and when it does
not. When the "neighborhood" crosses to the next array element, fillE
is called. fillE finishes determining the values for the remaining
cells in the integer and returns, ending the current evaluation.

There are some other details in the code (for example firstRule) which
could be made faster and/or cleaner, but their contribution is far
outweighed by the fill function.

If there is one thing I'm happy about, its the memory performance of
this code. It runs in a constant 7 - 10MB of memory, even when I'm
doing 100 generations with 100 rules and 100 CAs. Each CA is evaluated
for 250 steps, which means every generation does 2.5M steps. That kind
of memory usage is pretty sweet.

Thanks in advance to those willing to take the time to look at this code!

Justin