[Haskell-cafe] Numerical methods in Haskell

Mon Feb 20 05:47:49 EST 2006

I finally got some time to answer Simon's posting:

Simon P-J:
>| Between google searching and looking through the activity
>| report, I take it that no one has really developed serious
>| libraries for matrix manipulations, diff eqs, etc.
>| 
>| Are there any practical reasons for this or is it just a
>| matter of the haskell community being small and there not
>| being many people interested in something so specialized?
>
>The latter I think, but it's just the sort of thing that a functional
>language should be good at.  Two other difficulties
>
>(a) It's hard to compete with existing libraries.  The obvious thing is
>not to compete; instead, just call them.  But somehow that doesn't seem
>to be as motivating.  Perhaps some bindings exist though?

Hard to compete, yes. But on the other hand, the rewards can be high.
Numerical library code (especially matrix libraries) tends to be highly
optimized for the hardware architecture at hand. As a consequence a small
change in, say, the cache architecture, might require a thorough rewrite of
the library code to regain high utilisation of the hardware. This is since
languages like Fortran and C force you to code so close to the hardware. A
high-level language, with good optimizing compilation, would make also
library code more portable across hardware architectures. N.b. these
optimizations are non-trivial to say the least.

>(b) A concern about efficiency, because numerical computation is
>typically an area where people really care about how many instructions
>you take.  It's a legitimate concern, but I don't think that it'll turn
>out to be justified.  With unboxed arrays, and/or calling external
>libraries for the inner loops -- and the potential for aggressive fusion
>and/or parallelism, there is plenty of upward potential.  I also want to
>work on nested data parallelism (a la NESL, and NEPAL) which fits right
>in here.

The number of instructions is only one side of the coin. For
high-performance computing, memory issues are at least as important: both
the amount of memory used (e.g., will the computation fit into memory at
all), and how the memory hierarchy is utilized (caches, TLB:s, virtual
memory, ...). This is a really sweet spot of functional languages, and
laziness adds to it.

On the other hand, the increased abstraction of functional languages gives
an optimizing compiler larger freedom to reorder computations and choose
memory layouts of data structures like matrices. This is potentially very
useful, since optimizing for memory hierarchy utilization typically involves
both data layout and order of memory accesses. However, to achieve a good
result, the compiler must be able to predict a great deal of the computing
and the memory usage. For instance, dynamic memory handling of numeric data
structures will surely kill any serious attempt to predict the cache
behavior. To achieve good optimizing compilation, we need either very good
program analyses, or a library of recursion patterns or templates for
which the compiler knows how to allocate memory statically and order the
computations well, or possibly both.

Some encouraging examples: Sven-Bodo Scholz has achieved very good
performance for the restricted functional matrix language SAC, using
optimizations for cache. My former student Peter Drakenberg invented a
restricted functional matrix language, with analyses to infer matrix sizes
statically, and sharing analysis, to find opportunities to allocate memory
statically and update in-place. He also got some good experimental figures.
This leads me to believe that compilers in more general languages could do
something similar, by recognizing certain patterns or through advanced
program analyses.  However, both these languages are strict, and I am not
sure at all how to do this in a lazy language.

In any case, this is nontrivial compiler work and considerable research
efforts would be needed. Unfortunately, I don't see how to fund such
research, since the high-performance computing community largely seems to
have given up on functional languages since the demise of the data-flow
languages.

>I'd love to see a little community of matrix manipulators spinning up.  

Yes. There might me a niche for high-level numerical coding, somehwere where
MATLAB is today. MATLAB isn't exactly blazingly fast, still very widespread.
On the other hand, MATLAB is already in that niche. The question to answer
is what advantages a functional language like Haskell could offer in this
niche. We need to come up with these answers, and then convince enough
people outside our own community.

Björn Lisper