[Haskell-cafe] Generalised merge
Paul Johnson
paul at cogito.org.uk
Thu Mar 15 18:22:56 EDT 2007
The most common kind of primitive recursive function I find myself
writing these days is a variation on the theme of merging two sorted lists.
You can see some examples in my Ranged Sets library at
http://ranged-sets.sourceforge.net/. For instance:
-- | Set union for ranged sets. Infix precedence is left 6.
rSetUnion, (-\/-) :: DiscreteOrdered v => RSet v -> RSet v -> RSet v
-- Implementation note: rSetUnion merges the two lists into a single
-- sorted list and then calls normalise to combine overlapping ranges.
rSetUnion (RSet ls1) (RSet ls2) = RSet $ normalise $ merge ls1 ls2
where
merge ls1 [] = ls1
merge [] ls2 = ls2
merge ls1@(h1:t1) ls2@(h2:t2) =
if h1 < h2
then h1 : merge t1 ls2
else h2 : merge ls1 t2
-- | Set intersection for ranged sets. Infix precedence is left 7.
rSetIntersection, (-/\-) :: DiscreteOrdered v => RSet v -> RSet v -> RSet v
rSetIntersection (RSet ls1) (RSet ls2) =
RSet $ filter (not . rangeIsEmpty) $ merge ls1 ls2
where
merge ls1@(h1:t1) ls2@(h2:t2) =
rangeIntersection h1 h2
: if rangeUpper h1 < rangeUpper h2
then merge t1 ls2
else merge ls1 t2
merge _ _ = []
Union also has its own merge function.
The worst case I've come across was at work. I can't talk about the
details, but it involved manipulating two functions of time represented
by lists of samples. So I had a type TimeFunc = [(Value, Time)], and
the job was to compare two TimeFuncs with samples at different times.
Step 1 was to interpolate each TimeFunc with the values for the times in
the other TimeFunc, giving a result of type [(Value, Value, Time)] for
the union of all the times in both original TimeFuncs. I wrote a truly
hairy zipTimeFunc function with guards to match each possible case. It
worked, but it must have been 100 lines if you include the comments to
explain each case and demonstrate totality.
So I'm wondering if anyone has a more general pattern. Unfold? Some
variation on a theme of zip? I once tried writing a generalised merge,
but it needed half a dozen functions as arguments to handle all the
various cases. It was kludgier than just rolling a new merge routine
every time. And I don't think it would have handled the TimeFunc case.
Paul.
More information about the Haskell-Cafe
mailing list