[Haskell-beginners] Processing data from microphone interactively

Sat Oct 24 09:40:28 UTC 2015

Hello Martin,

In digital signal processing (DSP), audio samples are traditionally 
processed in *blocks*. Typical block sizes for real-time processing are 
64, 128 or 256 samples.

The reason for this is that audio processing is a performance-sensitive 
task. If your code is too slow, then it cannot process all audio in time 
and there will be jitter. Typically, the operations that are applied to 
the samples within a single block are fairly limited (mixing, 
convolution, ...) and can be optimized into a tight loop, which you can 
then reuse as a "black box". In contrast, operations that act on whole 
blocks (envelopes, ...) are more open-ended, and you would have to pay 
attention to optimizing them each and every time you write a program.

This is related to the concepts of "audio rate" and "control rate". The 
former is the frequency at which audio is sampled, i.e. the frequency 
"within" a block, while the latter corresponds to more coarse-grained 
operations that are approximately the same on every block.
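
To make the distinction concrete, here is a minimal sketch that uses only 
plain lists from base. The names `applyGain` and `fadeOut` are made up for 
this illustration and do not come from any library. The gain is chosen 
once per block (control rate) and then applied to every sample in that 
block (audio rate).

```haskell
-- Audio-rate work: one multiplication per sample, a tight inner loop.
applyGain :: Float -> [Float] -> [Float]
applyGain g = map (* g)

-- Control-rate work: the gain changes only once per block. Here it is a
-- linear fade-out, i.e. a very simple envelope (hypothetical example).
fadeOut :: [[Float]] -> [[Float]]
fadeOut blocks = zipWith applyGain gains blocks
  where
    n     = length blocks
    gains = [ 1 - fromIntegral i / fromIntegral n | i <- [0 .. n - 1] ]

main :: IO ()
main = mapM_ print (fadeOut (replicate 4 [1, 1]))
```

With four blocks the gains are 1, 0.75, 0.5 and 0.25, so each block is a 
bit quieter than the one before it. In a real program you would probably 
use unboxed vectors instead of lists for the inner loop.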

For you, this means that you probably want to call the `simpleRead` 
function with a block size of 128 and process each block individually 
before requesting the next. If individual processing proves too slow, 
you will have to use data structures that are closer to the machine, and 
call the `simpleReadRaw` function instead.
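
A rough sketch of such a loop might look as follows. It does not use the 
pulseaudio library itself. Instead, the `readBlock` argument is a 
placeholder for whatever action delivers the next block (for example, a 
`simpleRead` call with a block size of 128), and `rms` and `processBlocks` 
are names invented for this example. Each block is reduced to a single 
loudness value, which could then drive your animation.

```haskell
import Control.Monad (replicateM_)

-- Root-mean-square level of one block: a single control-rate value
-- computed from all samples in the block.
rms :: [Float] -> Float
rms [] = 0
rms xs = sqrt (sum (map (^ 2) xs) / fromIntegral (length xs))

-- Read a block, compute its level, hand the level to a consumer, and
-- repeat n times. `readBlock` is an assumed stand-in for the real
-- microphone read.
processBlocks :: Int -> IO [Float] -> (Float -> IO ()) -> IO ()
processBlocks n readBlock consume =
  replicateM_ n (readBlock >>= consume . rms)

main :: IO ()
main = do
  -- Fake microphone: always returns 128 samples of amplitude 0.5.
  let fakeRead = pure (replicate 128 0.5)
  processBlocks 3 fakeRead print
```

Passing the read action as an argument keeps the processing code 
independent of the audio backend, so you can test it without a 
microphone and later swap in the real read call.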

Best regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com

Martin Vlk wrote:
> Hi,
> I am looking at reading sound from a microphone and controlling some
> other activity based on the sound data as they come. The motivation for
> this is writing some interactive animated graphics controlled by
> properties of the sound from mic.
> 
> I am using the pulseaudio-simple library to read sound from the computer
> mic and that works fine. However the library function basically returns
> sound samples as a list of predefined length and this is not well suited
> for the kind of real-time processing I need.
> 
> I am looking for advice on what would be a good idiomatic way to design
> such a program in Haskell.
> 
> From some research I am imagining I need something like the conduit
> library to connect the sound data to other parts of my program, but I am
> not sure how that would work or if it is a good idea in the first place.
> 
> Or should I use some of the FRP libraries for this purpose?
> Or some other approach?
> 
> I'd appreciate some advice on the direction to take.
> 
> Many Thanks
> Martin