[Haskell-beginners] The cost of generality, or how expensive is realToFrac?

Thu Sep 16 17:33:53 EDT 2010

On Thursday 16 September 2010 21:47:43, Greg wrote:
> Wow, thanks for all the analysis on this, Daniel!
>
> So, I think the summary of your evaluation is that realToFrac does just
> fine in almost all cases

I wouldn't say that. realToFrac does fine in those cases where a rewrite 
rule provides a fast conversion (or one of the types bewteen which you want 
to convert is Rational, when you get fromRational or toRational - the 
latter can be improved in several cases).

> due to careful optimization and outperforms my
> floatToFloat trick (which answers the question I flagrantly begged...
> ;~) except that I can't access those optimizations because the OpenGL
> types are hidden behind newtypes and buried in a library.
>
> I've confirmed your results to make sure I could.
>
> I went a step further and typed convert for GLclampf expecting to see
> unoptimized performance, but it ran just as fast as Double->Float.
>
> convert :: Double -> GL.GLclampf
> convert = realToFrac

It would be interesting to see what core GHC produces for that (you can get 
the core with the `-ddump-simpl' command line flag [redirect stdout to a 
file] or with the ghc-core tool [available on hackage]).
If it runs as fast as realToFrac :: Double -> Float (with optimisations), 
GHC must have rewritten realToFrac to double2Float# and it should only do 
that if there are rewrite rules for GLclampf.
In that case, the problem is probably that GHC doesn't see the realToFrac 
applications because they're too deeply wrapped in your coordToCoord2D 
calls.

If that is the problem, it might help to use {-# INLINE #-} pragmas on 
coordToCoord2D, fromCartesian2D and toCartesian2D.
Can you try with realToFrac and the {-# INLINE #-} pragmas?

>
> And still ran faster than floatToFloat.  However there's no denying that
> floatToFloat runs *much* faster than realToFrac in the larger
> application.  Profiling shows floatToFloat taking about 50% of my CPU

That's too much for my liking, a simple conversion from Double to Float 
shouldn't take long, even if the Float is wrapped in newtypes (after all, 
the newtypes don't exist at runtime).

> time, but the frame rate is close to (though not quite) 30 fps.  Using
> realToFrac takes seconds just to display the first frame.
>
> As it stands, floatToFloat is responsible for 50% of my CPU time which I
> think is mainly a tribute the OpenGL implementation, but still makes it
> the obvious target for optimization.  Having gotten all excited about
> the new tool in my box, I wrote the following rule:
>
> {-# RULES
> "floatToFloat/id" floatToFloat=id
> "floatToFloat x2" floatToFloat . floatToFloat = floatToFloat
>  #-}
>
> Neither of which seems to fires in this application,

GHC reports fired rules with -ddump-simpl-stats.
Getting rules to fire is a little brittle, GHC does not try too hard to 
match expressions with rules, and if several rules match, it chooses one 
arbitrarily, so your rules may have been missed because the actual code 
looked different (perhaps because other rewrite rules fired first).

> but I did get the
> first one to fire by importing the same file into my benchmark.
>
> The next obvious step is to optimize a level up in the call hierarchy,
> and rewrite coordToCoord2D since I know my original pair2vertex function
> was faster.  So, I added this rule in the file where Vertex2 is made an
> instance of Coord2D:
>
> {-# RULES
> "coordToCoord2D/p2v2"  coordToCoord2D = pair2vertex
>  #-}
>
> which refers to two functions, each in different files (I don't think
> that matters, but mention it just in case)
>
> pair2vertex :: (Num a) => (a,a) -> GL.Vertex2 a
> pair2vertex (x,y) = GL.Vertex2 x y
>
> coordToCoord2D :: (Coord2D a, Coord2D b) => a -> b
> coordToCoord2D = fromCartesian2D . toCartesian2D
>
>
> directly after my coordToCoord2D definition I have this rule as well:
>
> {-# RULES
> "coordToCoord2D/id" coordToCoord2D = id
> "coordToCoord2D x2" coordToCoord2D . coordToCoord2D = coordToCoord2D
>  #-}
>
>
> I get a compile time error that I can't make sense of.  It's asking me
> to put a context on my rule, but I can't find any references on how to
> do that...
>
> -----------
>
>     Could not deduce (Num a)
>       from the context (Coord2D (Vertex2 a), Coord2D (a, a))
>       arising from a use of `pair2vertex'
>                    at GCB/OpenGL/Geometry.hs:32:40-50
>     Possible fix:
>       add (Num a) to the context of the RULE "coordToCoord2D/p2v2"
>     In the expression: pair2vertex
>     When checking the transformation rule "coordToCoord2D/p2v2"
>
>
> -----------
>
> The file:line:column is the "pair2vertex" token in the rule I list
> above.

Yes, there's a Num constraint on pair2vertex, but not on coordToCoord2D, so 
it's not type correct.
You could try removing the Num constraint from pair2vertex or add the 
constraint to the rule,

{-# RULES
"coordToCoord2D/p2v2" forall a. Num a =>  (coordToCoord2D :: (a,a) -> 
GL.Vertex2 a) = pair2vertex
   #-}

(well, I don't know whether that's the correct way, but you can try).

>
> Cheers--
>  Greg