[Haskell-beginners] Optimising a combinatorial algorithm?

Bernhard Herzog bh at intevation.de
Sat Sep 26 14:55:05 UTC 2015


On 24.09.2015, Mario Lang wrote:
> TL;DR: My implementation looks elegant (to me), but is rather slow
> compared to a well optimized C++ impl. of the same algorithm.  Runtime
> differences are currently huge.  Looking for input on what I could
> change to match C++ performance, without loosing too much expressiveness
> of the code.
[...]
> Profiling tells me I am spending 90% of all the time in 'dur', which is my
> small helper method to calculate the duration of a PartialVoice,
> PartialMeasure or Voice.
>
> The project is setup to support profiling:
>
> git clone https://github.com/mlang/hbmc
> cd hbmc
> cabal run profiling


Some performance tips:

AFAICT, the main reason the dur methods are so slow is they traverses
the lists much too often. An easy way to reduce that number is to cache
the duration of lists of Sign values in PartialVoice by replacing the
type alias with a new data type like this:

  data PartialVoice = PV (Maybe Rational) [Sign]

To always initialize the duration when constructing a PartialVoice, a
helper function is useful:

  mkPV signs = PV (sumDuration signs) signs

Here sumDuration is defined in the same way as the current dur method
for PartialVoice:

  sumDuration :: Duration a => [a] -> Maybe Rational
  sumDuration = foldl' (liftA2 (+)) (pure 0) . map dur

The dur method for PartialVoice can then be implemented by simply
accessing the cached value:

  instance Duration PartialVoice where

      dur (PV d _) = d

Since that parameter of PV is lazy the actual sum is only computed when
the value is needed and for any given PV value it is computed at most
once.

On my system, that change alone improved the running time by a factor of
almost 10 according to the profiler output.

The type Voice can obviously be treated in a similar way.


Some other changes that can improve the speed:


Use Integer instead of Rational.

This should be possible since all duration values appear to be integer
multiples of 1/128, so you can just represent them by that factor and
use a newtype wrapper for increased type safety.


Parameterize Sign with the type of the duration.

AFAICT the only reason to wrap the duration in a Maybe is that you
cannot assign a duration during parsing, so the parsers always assign
Nothing as duration. The pvs function will always assign a Just-value to
each Sign, however. So after parsing, the Measure has Nothing for the
duration in every Sign, but the Measures returned by ms always have Just
values. You you could express this in the types by parameterizing Sign
and the other types with the type of a duration. The parser would then
create e.g. Sign () values and pvs would turn those into e.g. Sign
Rational values.

This improves type safety, makes the meaning clearer and simplifies the
code because you don't need to lift operations into the Maybe or do case
analyses.


Some stylistic things:

  instance Duration Measure where
      dur m = case m of [] -> Nothing; otherwise -> dur (head m)

would better be written like this:

  instance Duration Measure where

      dur []    = Nothing
      dur (x:_) = dur x

I.e. try not to use head. Use pattern matching instead. Particularly in
cases like this, where you already pattern match on the list in
question. This applies to some other functions as well, e.g. allEqDur.

Also, if you want to ignore parts of a pattern match, use "_" as the
pattern, not "otherwise". In the way you used it, it introduces a new
binding in that branch of the case expression and shadows the definition
from the Prelude. Idiomatic use of "otherwise" is as the condition on
the catch-all case of a guard.

If you compile your code with GHC's -Wall option it warns about this and
other things.


  Bernhard


More information about the Beginners mailing list