[Haskell-cafe] Re: Difficult memory leak in array processing

Wed Nov 29 14:27:50 EST 2006

> I personally find doing higher order functions with IO extremely
> difficult, and the resulting compiler errors are often downright scary.
> But this is probably a direct consequence my rather limited
> understanding of the monadic operators that the do notion hides. It
> seems that one has to be /very/ good in Haskell before one can do IO
> effectively. Hmm.

Well, as good as a necromancer has to be :) But seriously, I think the
most mind boggling point about IO actions is that those are higher
order, too. I for myself have taken the point of view that values of
type (IO a) are just "plans" that will be executed when the main Haskell
program has done its work. The programs task is to assemble the "main
plan" by sequencing smaller "plans" with (>>=).

> But back to the point, using Haskell lists for representing a largish
> buffer of, say, 16-bit samples that is constantly being refreshed from
> hard disk and sent into the sound system for playback would probably be
> inefficient beyond imagination.

Lists are indeed not suited for massive byte crunching.

If they fit into a black box, you can of course outsource things into C.
In case you were writing a track scheduler à la iTunes, the black box
most likely would be a single function (playSoundFile) which does the
job of handling data from disk to the sound card.

Actual sound processing with Haskell functions is more involved. As
already said, specifying loops over the samples as higher order
functions will save you a lot of headaches. The point is that one just
cannot start writing Haskell code and hope that it will run as fast as a
tight loop in C. Instead, one should do aggressive optimizations at a
few critical points only. And these are exactly the higher order loops.

I have to admit that (accumM) is not very fast because it is able to
change data at arbitrary indexes which therefore have to be kept around.
Most often, you want to process each index exactly once which is better
expressed as a (map) or a (fold). As Bulat and Duncan already said,
Data.ByteString does exactly this for arrays of bytes. Using it or the
generalization promised by Duncan is likely to be the best way to go.

On the implementation level, lazy evaluation is in the way when
crunching bytes. So be sure to tell GHC to make things strict and to
unbox and inline everything it can put its hands on by giving
appropriate command line options. As others already pointed out, the
details are on
    http://haskell.org/haskellwiki/Performance
As a first test, you may want to compile your original code with -O3
-optc-O3 -funfolding-use-threshold=16 and explore what happens.
GHC does a good job with strictness analysis and it's of no use to drown
your code in strictness annotations. Of course, some well placed ones
may hint GHC about things it overlooked.

To mention yet another approach, the image synthesis library Pan
    http://conal.net/pan/
pretends that the programmer writes ordinary Haskell functions, but
under the hood, it's a tiny programming language that gets compiled to
C++. Of course, the main point is that the image synthesis combinators
are strictly less powerful than full Haskell. But as the full power is
unneeded in that context, this doesn't hurt.

While there are audio related projects,
    http://haskell.org/haskellwiki/Libraries_and_tools/Music_and_sound
I don't know whether they focus on speed.

Regards,
afpelmus