[Haskell-cafe] array fusion interface

Evan Laforge qdunkan at gmail.com
Tue Jul 22 09:17:59 EDT 2008


With all the noise lately about high performance array libraries with
list-like interfaces, such as bytestring, storablevector, uvector, and
vector, I thought I would try to make use of one in a project of mine,
and I'm either bumping up against the limits of its expressiveness, or
am missing out on how to express my problem.

I have streams of samples with irregular sampling rates, so they look
like [(Time, SampleVal)].  In practice, this means [(Double, Double)].
 I can do this with storablevector by either storing two Vector
Double, or by making a (Double, Double) Storable instance (more
natural but requires -XFleixbleInstances).

Now most functions that operate over a sample stream actually want a
pair of samples so they can get an exact time with linear
interpolation.  Combining two streams requires them to both be
resampled so their points line up, e.g. with a list implementation
(ignoring the edges):

resample as@((ax0, ay0) : (ax1, ay1) : _)  bs@((bx0, by0) : (bx1, by1) : _)
    | ax1 == bx1 = (ax1, (ay1, by1)) : resample (tail as) (tail bs)
    | ax1 < bx1 = (ax1, (ay1, y_at (bx0, by0) (bx1, by1) ax1))
        : resample (tail as) bs
    | otherwise = (bx1, (y_at (ax0, ay0) (ax1, ay1) bx1, by1))
        : resample as (tail bs)
y_at (x0, y0) (x1, y1) x = (y1 - y0) / (x1 - x0) * (x - x0) + y0

So there are a number of things here that don't seem to work so well
with a vector interface.  The first is that this returns [(Val, (Val,
Val))].  I could either write an instance of Storable for (Double,
(Double, Double)), or I could pass  a function (Val, Val) -> Val.  It
looks like uvector handles zips more nicely though.  The second is
that this, like most such functions, wants a pair of samples to do
linear interpolation.  With lists I can do pattern matching as above
or 'zip xs (drop 1 xs)', and with vectors the slightly less natural
mapAccumL.  Ok, then the third problem is that this progresses at a
variable rate down two vectors, and I don't know any way to express
that.  Also, as a fourth problem, many functions taking samples will
wind up generating more samples, which requires a concatMap kind of
thing, which uvector doesn't seem to have, but in storablevector and
lazy bytestring involves converting the whole thing to a list and
back.

What I'd ideally like is a lazy stream of unpacked chunks ala
ByteString.Lazy.  It seems like it might be possible to do something
with ST and mutable arrays, and somehow come up with an abstraction
that hides the packing and chunks and freezing but is still flexible
enough to write something like 'resample', but I haven't thought about
what that would look like.  And it would be much nicer to try to fit
something into the array fusion kind of interface.  Is it possible?
Is concatMap and "complicated zip" incompatible with the listlike
array interface and fusion?  Is there a better way to express this
that avoids the problems?  Initially this seemed like the perfect
application for storablevector or uvector, but maybe it's not
actually.


More information about the Haskell-Cafe mailing list