Blake Rain blake.rain at gmail.com
Mon Jan 17 08:45:04 CET 2011

```Dear Haskellers,

I thought I'd take some time to share something of my weekend with
you  all.   Not because  of  anything new,  but  because  it is  a
morning, perhaps.

What  feels  like  decades  ago,   I  used  to  work  in  computer
graphics.   Mostly    my   work   was   on    software   for   the
amateur/semi-professional  film   industry  -  developing  various
filters and effects for the likes of Adobe AfterEffects and so on.

I  have   not  touched   computer  graphics  from   a  programming
perspective for quite a long time  now, and I thought it was about
time that  I flexed those  old muscles. So,  I have taken  it upon
myself to write an SVG renderer for Blender.

To cut what is going to be  a very long and humdrum story short, I
needed to write some matrix arithmetic code.

Having not done  this in a long  time, I thought I'd see  if I can
still actually remember what matrix multiplication is by writing a
function to multiply two 4x4 matrices.  I thought I'd write one in
C and then one in Haskell, and see how long they took and how easy
they were to write.

Here is the one in C:

void multiplyM4 (float m1[4][4], float m2[4][4], float m3[4][4]) {
m1[0][0] = m2[0][0] * m3[0][0] + m2[0][1] * m3[1][0] + m2[0][2] *
m3[2][0] + m2[0][3] * m3[3][0];
m1[0][1] = m2[0][0] * m3[0][1] + m2[0][1] * m3[1][1] + m2[0][2] *
m3[2][1] + m2[0][3] * m3[3][1];
m1[0][2] = m2[0][0] * m3[0][2] + m2[0][1] * m3[1][2] + m2[0][2] *
m3[2][2] + m2[0][3] * m3[3][2];
m1[0][3] = m2[0][0] * m3[0][3] + m2[0][1] * m3[1][3] + m2[0][2] *
m3[2][3] + m2[0][3] * m3[3][3];

m1[1][0] = m2[1][0] * m3[0][0] + m2[1][1] * m3[1][0] + m2[1][2] *
m3[2][0] + m2[1][3] * m3[3][0];
m1[1][1] = m2[1][0] * m3[0][1] + m2[1][1] * m3[1][1] + m2[1][2] *
m3[2][1] + m2[1][3] * m3[3][1];
m1[1][2] = m2[1][0] * m3[0][2] + m2[1][1] * m3[1][2] + m2[1][2] *
m3[2][2] + m2[1][3] * m3[3][2];
m1[1][3] = m2[1][0] * m3[0][3] + m2[1][1] * m3[1][3] + m2[1][2] *
m3[2][3] + m2[1][3] * m3[3][3];

m1[2][0] = m2[2][0] * m3[0][0] + m2[2][1] * m3[1][0] + m2[2][2] *
m3[2][0] + m2[2][3] * m3[3][0];
m1[2][1] = m2[2][0] * m3[0][1] + m2[2][1] * m3[1][1] + m2[2][2] *
m3[2][1] + m2[2][3] * m3[3][1];
m1[2][2] = m2[2][0] * m3[0][2] + m2[2][1] * m3[1][2] + m2[2][2] *
m3[2][2] + m2[2][3] * m3[3][2];
m1[2][3] = m2[2][0] * m3[0][3] + m2[2][1] * m3[1][3] + m2[2][2] *
m3[2][3] + m2[2][3] * m3[3][3];

m1[3][0] = m2[3][0] * m3[0][0] + m2[3][1] * m3[1][0] + m2[3][2] *
m3[2][0] + m2[3][3] * m3[3][0];
m1[3][1] = m2[3][0] * m3[0][1] + m2[3][1] * m3[1][1] + m2[3][2] *
m3[2][1] + m2[3][3] * m3[3][1];
m1[3][2] = m2[3][0] * m3[0][2] + m2[3][1] * m3[1][2] + m2[3][2] *
m3[2][2] + m2[3][3] * m3[3][2];
m1[3][3] = m2[3][0] * m3[0][3] + m2[3][1] * m3[1][3] + m2[3][2] *
m3[2][3] + m2[3][3] * m3[3][3];
}

Urgh! This took me over 20  minutes to write out, and a further 34
minutes  of  debugging.   That's  54  minutes to  write  a  simple
function!   This included 23  invocations of  the compiler  and 18
invocations of  the test program.  I was disgusted with  myself! I
should have automated the writing of it with Haskell!

The  reason for this  was mostly  that I  kept pressing  the wrong
buttons  when writing those  many, many  array indices.   I became
blind  to the  numbers in  the  end, and  it became  one of  those
nightmare  debugging  sessions where  you  check your  printMatrix
function and  other stupid things.  I'd  like to say that  I was a
bit tired, but that's just an excuse.

Some points about the C version:

1. It's only 21 lines of code.

2. It  took me 54 minutes  to write and debug.   Therefore I can
write eight of these in a standard working day. Hmm.

3. It's written in C,  so everybody and their dog understands it
completely. Honest.

4. It's written  in C, so it can just be  pasted into nearly any
other   language   (because   they're  all   just   augmented
C). Therefore no one needs to understand it.

5. It does not consume much space (in terms of memory).

6. It's pretty fast (being written in C):

Performing  100,000,000 matrix  multiples  takes on  average
6.117s (with gcc -O2).

7. Passing a NULL pointer anywhere will cause it to explode.

8. If the pointers are not to arrays of 16 floats, it will trash
memory  (being  as  it  destructively  updates  one  of  it's
arguments).

9. It will only operate on 4x4 float matrices.

10. How am I storing matrices again...? Column major?

So, after drinking  some coffee and having a  smoke, I started the

[foldl (+) 0 \$ zipWith (*) x y | x <- m1, y <- transpose m2]

Ah.

1  It's only  one line  of code  - that's  20 less  than  the C
version.

2.  It took  me just  over five  minutes to  think about  and to
code. That's a saving of  49 minutes (1180% or something). It
compiled straight away and worked first time. I can write too
many of these in a day.

3. It's laughably clear and concise. Anyone can easily tell what
I'm doing.

4. It  does *not*  operate in constant  space. I think.  I can't
tell. I don't know how to tell. Maybe it does.

5. It's very fast (being written in Haskell):

Performing  100,000,000  matrix  multiples takes  on  average
1.435s  (with ghc  -O2).

6. Passing in empty lists will not crash it.

7. It will operate on square matrices of any size (3x3, 4x4...).

8. It will operate on a matrix of any type that is of class Num.

The Haskell  version seems to win  on all counts.

Interestingly  enough (to some at least), other  matrix operations
also become  trivial when expressed  in Haskell:

Determinant is  a bit fiddly,  but quite trivial. However  it is
woefully slow at  O(n!). I need to make  use of LU decomposition
(in which the  determinant is the sum of  the diagonal values in
U).

Adjoint is even more of a fiddle, but actually a one-liner after
reusing  bits   of  the  determinant   function  (including  the
determinant function).

Matrix  inversion is  easy in  one  line, but  dependent on  the
determinant and the adjoint, therefore also woefully slow.

Just another example of Haskell being awesome :)

Now over to you guys! What experiences have you had which make you