[Haskell-cafe] Re: Difficult memory leak in array processing

Niko Korhonen niko.korhonen at gmail.com
Tue Nov 28 05:42:10 EST 2006


apfelmus at quantentunnel.de wrote:
> I tried to show how the code can be rewritten in a more declarative
> manner by the use of the higher order function (accumM) which is akin to
> the accumulation function (Data.Array.accum) for immutable arrays. This

Ok, given this explanation and your detailed annotations about the code,
the whole thing now makes much more sense to me. This is indeed a much
more Haskellish, more beautiful and certainly much less evil way to
accomplish what I wanted. Thank you for taking the time to explain the
whole thing in such detail; if nothing else, I've gained some pretty
valuable insights on How To Do Things The Functional Way.

Later in your reply you suggest that IOUArray would be overkill for
statistical analysis of random numbers, but possibly feasible for an
actual audio buffer. I actually had both of these scenarios in mind. I
have a version of the code that processes plain ol' Haskell lists instead
of evil IOUArrays. Needless to say, this code is much more elegant and
leans much more heavily on folds and other higher order functions than
the IOUArray code.
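
To give a rough idea of that list-based style (not the actual code, just
a small sketch in the same spirit, using accumArray, the immutable cousin
of the accum function you mentioned):

import Data.Array (accumArray, elems)

-- Sketch: a ten-bucket histogram of the samples, built purely with
-- accumArray; no IOUArray and no explicit mutation anywhere.
histogram :: [Int] -> [Int]
histogram xs = elems (accumArray (+) 0 (0, 9) [ (x `mod` 10, 1) | x <- xs ])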

I personally find writing higher order functions in IO extremely
difficult, and the resulting compiler errors are often downright scary.
But this is probably a direct consequence of my rather limited
understanding of the monadic operators that the do notation hides. It
seems that one has to be /very/ good at Haskell before one can do IO
effectively. Hmm. In imperative languages IO is the first thing one
learns, before understanding how to combine a series of operations into
a complete algorithm. Hence the 'void doIt()' functions that are so
ubiquitous in the C and Java code out there. People may say that
real-life Haskell code is littered with strictness annotations, but
real-life C/C++/C#/Java code is littered with void doIt() functions,
which are an order of magnitude worse from a code clarity point of view...
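
To illustrate the do notation point above, it took me a while to see that
a do block like the first definition below is just sugar for the second
one, written with >>= directly (a toy example, not from the actual code):

readAndDouble :: IO ()
readAndDouble = do
  line <- getLine
  print (read line * 2 :: Int)

readAndDouble' :: IO ()
readAndDouble' =
  getLine >>= \line ->
  print (read line * 2 :: Int)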

But back to the point: using Haskell lists to represent a largish buffer
of, say, 16-bit samples that is constantly being refreshed from hard disk
and sent to the sound system for playback would probably be inefficient
beyond imagination. The overhead of one list element is probably larger
than the size of the element itself (a 16- to 32-bit integer), and a new
list would have to be generated every time more data is read from the
hard disk, so the garbage collector would be working really hard all the
time. And the list would have to be converted into an IO buffer for
passing to the C sound API anyway. These were the reasons why I went for
the IOUArray approach to begin with. For this particular problem,
probably the most natural and efficient solution is to have a static
buffer in memory into which data is read from the hard drive and which is
then passed to the sound API for playback. For me, this seems the best
way to do the job.
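
Something along these lines is what I have in mind (only a sketch;
write_audio is a made-up name for whatever write call the C sound API
provides, and StorableArray stands in for IOUArray here because it can
hand a raw pointer to C):

{-# LANGUAGE ForeignFunctionInterface #-}
import Data.Array.Storable
import Data.Int (Int16)
import Foreign.Ptr (Ptr)

-- Hypothetical binding to the C sound API's blocking write call.
foreign import ccall "write_audio"
  c_writeAudio :: Ptr Int16 -> Int -> IO ()

bufferSize :: Int
bufferSize = 4096

-- Allocate one static buffer, then refill it from disk and hand it to the
-- sound API over and over, without generating a fresh list each time.
playLoop :: (StorableArray Int Int16 -> IO ()) -> IO ()
playLoop refillFromDisk = do
  buf <- newArray (0, bufferSize - 1) 0
  let loop = do
        refillFromDisk buf                 -- read the next block from disk
        withStorableArray buf $ \ptr ->    -- expose the raw pointer to C
          c_writeAudio ptr bufferSize
        loop
  loop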

But even this is not so obvious, since I don't really know the
performance implications of different Haskell constructs or the strong
and weak points of the GHC compiler. The initial code I posted (without
the number counting) seems to take an awfully long time just to fill the
array with random numbers, especially compared to a Java (or actually
Scala) solution that does more or less the same thing in 1/30 of the
time. I will probably run some comparisons of my original code against
the code you posted to see whether there is a significant difference not
only in elegance but in performance too.
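
For reference, the kind of code I'm timing looks roughly like this (a
sketch, not the exact program I posted):

import Data.Array.IO
import System.Random (randomRIO)
import Control.Monad (forM_)

-- Fill an unboxed IO array with n random samples in the 16-bit range.
fillRandom :: Int -> IO (IOUArray Int Int)
fillRandom n = do
  arr <- newArray (0, n - 1) 0
  forM_ [0 .. n - 1] $ \i -> do
    x <- randomRIO (-32768, 32767)
    writeArray arr i x
  return arr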

Regards,
Niko


