[Haskell-beginners] Flying Dutchman sailing blind
Daniel Fischer
daniel.is.fischer at web.de
Wed Oct 13 09:53:27 EDT 2010
On Wednesday 13 October 2010 14:52:58, Jeroen van Maanen wrote:
> So in fact the culprit turned out to be the function updateRationals in
> the module
>
>
> http://lexau.svn.sourceforge.net/viewvc/lexau/branches/totem/src/LExAu/D
>istribution/MDL.hs?view=markup
>
> It is still eating more time than the actual optimizer, so suggestions
> for improvement are still welcome.
First,
approximateRational :: Double -> Rational
approximateRational x =
let (m, e) = decodeFloat x
in if e >= 0
then (m * (2 ^ e)) % 1
else m % (2 ^ (-e))
is exactly toRational, so there's no need for that function. Also, there's
a much faster implementation of toRational (for Double and Float) underway,
I think it will be in the first GHC 7 release due in a couple of weeks.
Anyway, using toRational will let you profit from that with no additional
effort when you get a compiler with that patch.
Second,
data Threshold =
Threshold
{ theBoundA :: Rational
, theBoundB :: Rational
, theCountA :: Integer
, theCountB :: Integer
}
deriving Show
I have not looked much at the code, but it seems likely that you will want
strict fields there,
data Threshold =
Threshold
{ theBoundA :: !Rational
, ...
}
but that's to be tested later.
Third,
mapToThresholds :: [Threshold] -> [Rational] -> [(Rational, Rational)]
mapToThresholds _ [] = []
mapToThresholds thresholds@((Threshold boundA boundB intA intB) :
moreThresholds) rationals@(x : moreRationals)
| x > boundB = mapToThresholds moreThresholds rationals
| x > boundA =
let width = boundB - boundA
count = fromInteger (intB - intA)
mapped = (((x - boundA) * count) / width) + (fromInteger intA)
in (mapped, x) : mapToThresholds thresholds moreRationals
| True = error $ "Rational is too small: " ++ (show x) ++ " < " ++ (show
boundA)
mapToThresholds [] (x : _) = error $ "Rational is too big: " ++ (show x)
will probably profit from making mapped strict,
let ...
in mapped `seq` (mapped, x) : mapToThreholds ...
Now updateRationals:
updateRationals :: Integer -> [(Integer, Rational)] -> Integer ->
[(Integer, Rational)]
updateRationals previousWeight previousRationals w
else let mapped = mapToThresholds thresholds boundaries
mappedIntervals = zip ((0, 0) : mapped) mapped
((_, a), (_, b)) = foldl1' maxMappedInterval
mappedIntervals
That's no good, unfortunately.
maxMappedInterval :: ((Rational, Rational), (Rational, Rational)) ->
((Rational, Rational), (Rational, Rational)) -> ((Rational, Rational),
(Rational, Rational))
maxMappedInterval ((ma, a), (mb, b)) ((mc, c), (md, d)) =
if md - mc > mb - ma
then ((mc, c), (md, d))
else ((ma, a), (mb, b))
foldl1' evaluates the result of maxMappedInterval to weak head normal form,
that is to the outermost constructor.
Depending on what the optimiser does, that may or may not evaluate the
condition md - mc > mb - ma, but it will *not* look at a, b, c, d, and at
least the second components of the inner pairs happily build thunks,
possibly keeping references to the elements of the list already processed,
so keeping stuff from being garbage collected.
What you need is a strict type to contain your Rationals,
data SRQ = SRQ !Rational !Rational !Rational !Rational
maxMappedInterval :: SRQ -> ((Rational,Rational)) -> SRQ
maxMappedInterval s@(SRQ ma a mb b) ((mc,c),(md,d))
| mb - ma < md - mc = SRQ mc c md d
| otherwise = s
Then the foldl1' will evaluate all components and you don't get thunks or
space leaks.
More information about the Beginners
mailing list