[Haskell-beginners] The cost of generality, or how expensive is realToFrac?

Wed Sep 15 14:50:13 EDT 2010

Hey, thanks, Daniel.

I hadn't come across rewrite rules yet.  They definitely look like something worth learning, though I'm not sure I'm prepared to start making custom versions of OpenGL.Raw...

It looks like I managed to put that battle off for another day, however.  I did look at how realToFrac is implemented and (as you mention) it does the fromRational . toRational round trip suggested in a number of sources, including Real World Haskell.  Looking at what toRational does, building a ratio of two Integers out of a float, it seems like a crazy amount of effort to go through just to convert one floating point type to another.
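
To make the detour through Rational concrete, here is a small sketch that can be loaded into ghci.  The names viaRational and rationalParts are made up for illustration; only toRational, fromRational, numerator and denominator come from the standard libraries.

```haskell
import Data.Ratio (denominator, numerator)

-- Same shape as the default definition quoted below:
--   realToFrac = fromRational . toRational
-- (viaRational is a hypothetical name, not a library function.)
viaRational :: (Real a, Fractional b) => a -> b
viaRational = fromRational . toRational

-- Expose the exact fraction that toRational builds for a Double.
-- Both components are arbitrary-precision Integers.
rationalParts :: Double -> (Integer, Integer)
rationalParts x =
  let r = toRational x
  in  (numerator r, denominator r)
```

For example, rationalParts 0.1 gives (3602879701896397, 36028797018963968), so every conversion has to build and then take apart a pair of large Integers.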

Looking at the RealFloat class rather than Real and Fractional, it seems like this is a much more efficient way to go:

floatToFloat :: (RealFloat a, RealFloat b) => a -> b
floatToFloat = (uncurry encodeFloat) . decodeFloat

I substituted this in for realToFrac and I'm back to close to my original performance.  Playing with a few test cases in ghci, it looks numerically equivalent to realToFrac.
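
The kind of spot check I mean looks roughly like this.  The helper agreesOnWidening and the samples list are made up for the test.  It only covers widening from Float to Double, where both routes should be exact for finite inputs; narrowing from Double to Float involves rounding, and I have not verified that the two routes agree in every such case.

```haskell
-- The conversion from the message above.
floatToFloat :: (RealFloat a, RealFloat b) => a -> b
floatToFloat = uncurry encodeFloat . decodeFloat

-- Hypothetical helper: does floatToFloat give the same Double as
-- realToFrac when widening a finite Float?
agreesOnWidening :: Float -> Bool
agreesOnWidening x = (floatToFloat x :: Double) == (realToFrac x :: Double)

-- A handful of finite sample values, chosen arbitrarily.
samples :: [Float]
samples = [0, 1, -1, 0.1, 3.14159, 1.0e-3, 6.02e23, -2.5e-7]
```

In ghci, all agreesOnWidening samples is a quick way to run the comparison.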

This raises the question, though: am I doing something dangerous here?  Why isn't this the standard approach?

If I understand what's happening, decodeFloat and encodeFloat are breaking the floating point numbers up into their constituent parts-- presumably by bit masking the raw binary.  That would explain the performance improvement.  I suppose there is some implementation dependence here, but as long as the encode and decode are implemented as a matched set then I think I'm good.
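
One caveat I can see: the base documentation says the behaviour of decodeFloat is unspecified on infinite or NaN values, so the plain version makes no promises there, and since decodeFloat returns (0,0) for any zero, the sign of a negative zero is lost as well.  A guarded variant might look like the sketch below.  The name floatToFloatChecked is made up, and this assumes both types are IEEE-style floats where 0/0 is NaN and 1/0 is infinity.

```haskell
-- Hypothetical guarded version of floatToFloat.  Special values are
-- handled explicitly; everything else takes the decode/encode route.
-- Assumes IEEE-style target types (0/0 is NaN, 1/0 is Infinity).
floatToFloatChecked :: (RealFloat a, RealFloat b) => a -> b
floatToFloatChecked x
  | isNaN x          = 0 / 0
  | isInfinite x     = if x > 0 then 1 / 0 else negate (1 / 0)
  | isNegativeZero x = negate 0
  | otherwise        = uncurry encodeFloat (decodeFloat x)

-- For reference, the decoded form of a finite value is a pair
-- (mantissa, exponent) with value = mantissa * 2^exponent, e.g.
--   decodeFloat (1.5 :: Double) == (6755399441055744, -52)
```

The extra guards cost a few comparisons per call, so whether they are worth it depends on whether NaNs or infinities can actually reach the conversion.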

Cheers--
 Greg

On Sep 15, 2010, at 1:56 AM, Daniel Fischer wrote:

> On Wednesday 15 September 2010 02:51:01, Greg wrote:
>> First, to anyone who recognizes me by name, thanks to the help I've been
>> getting here I've managed to put together a fairly complex set of code
>> files that all compile together nicely, run, and do exactly what I
>> wanted them to do.  Success!
>> 
>> The trouble is that my implementation is dog slow.
>> 
>> Fortunately, this isn't the first time I've been in over my head and I
>> started by putting up some simpler scaffolding- which runs much more
>> quickly.  Working backwards, it looks like the real bottleneck is in
>> the data types I've created, the type variables I've introduced, and the
>> conversion code I needed to insert to make it all happy.
>> 
>> I'm not sure it helps, but I've attached a trimmed down version of the
>> relevant code.  What should be happening is my pair is being converted
>> to the canonical form for Coord2D which is Cartesian2D and then
>> converted again to Vertex2.  There shouldn't be any change made to the
>> values, they're only being handed from one container to another in this
>> case (Polar coordinates would require computation, but I've stripped
>> that out for the time being).  However, those handoffs require calls to
>> realToFrac to make the type system happy, and that has to be what is
>> eating up all my CPU.
> 
> Not all, but probably a big chunk of it.
> The problem is that the default implementation of realToFrac is
> 
> realToFrac = fromRational . toRational
> 
> a) with that implementation, realToFrac :: Double -> Double is not the 
> identity (doesn't respect NaNs)
> b) it's slow, there are no special operations to convert Double, Float etc. 
> from/to Rational.
> 
> For a lot of types, GHC provides rewrite rules (you need to compile with 
> optimisations to have them fire) which give faster versions (with somewhat 
> different behaviour, e.g. realToFrac :: Double -> Double is rewritten to 
> id, realToFrac between Float and Double uses primitive widening/narrowing 
> ops, for several newtype wrappers around Float/Double there are rules too).
> 
>> 
>> I think there are probably 4 calls to realToFrac.  If I walk through the
>>  code, the result, given the pair p, should be: Vertex2 (realToFrac
>> (realToFrac (fst p)))  (realToFrac (realToFrac (snd p)))
>> 
>> I'd like to maintain type independence if possible, but I expect most
>> uses of this code to feed Doubles in for processing and probably feed
>> GLclampf (Floats, I believe)
> 
> newtype wrapper around CFloat, which is a newtype wrapper around Float
> 
> Unfortunately, there are no rewrite rules in the module where it is 
> defined, nor apparently in any other module that has access to the 
> constructor. And the constructor is not accessible from any of the exposed 
> modules, so as far as I know, you can't provide your own rewrite rules.
> 
>> to the OpenGL layer.  If there's a way to
>> do so, I wouldn't mind optimizing for that particular set of types.
>>  I've tried GLdouble, and it doesn't really improve things with the
>> current code.
>> 
>> Is there a way to short circuit those realToFrac calls if we know the
>> input and output are the same type?  Is there a way to merge the nested
>> calls?
> 
> You can try rewrite rules
> 
> {-# RULES
>   "realToFrac2/realToFrac"         realToFrac . realToFrac = realToFrac
>   "realToFrac/id"                  realToFrac = id
>   #-}
> 
> but I'm afraid the second won't work at all; in that case you'd have to specify all 
> interesting cases yourself (there are rules for the cases Double -> Double 
> and Float -> Float in GHC.Float, rules for converting from/to CFloat and 
> CDouble in Foreign.C.Types, so those should be fine too)
>   "realToFrac/GLclampf->GLclampf"  realToFrac = id :: GLclampf -> GLclampf
> and what else you need.
> Whether the first one will help (or even work), I don't know either, you 
> have to try.
> 
>> 
>> Any other thoughts on what I can do here?  The slow down between the two
>> implementations is at least 20x, which seems like a steep penalty to
>> pay.
> 
> In case of emergency, put the needed rewrite rules into the source of 
> OpenGLRaw yourself.
> 
>> 
>> And while I'm at it, is turning on FlexibleInstances the only way to
>> create an instance for (a,a)?
> 
> Yes. Haskell98 doesn't allow such instance declarations, so you need the 
> extension.
>
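
For my own notes, here is a minimal sketch of how the per-type rules Daniel describes might look when attached to a wrapper function I control rather than to realToFrac itself.  The name convert is made up.  The rules only have a chance to fire when compiling with optimisations, and I have not measured whether they help.

```haskell
-- Hypothetical wrapper around realToFrac that gives the rules below
-- a function of my own to match on.
convert :: (Real a, Fractional b) => a -> b
convert = realToFrac
-- Delay inlining so the rules get a chance to fire first.
{-# NOINLINE [1] convert #-}

-- Same-type conversions are rewritten to id, in the same style as the
-- rules Daniel quotes.  Rules are ignored without optimisation.
{-# RULES
"convert/Double->Double" convert = id :: Double -> Double
"convert/Float->Float"   convert = id :: Float -> Float
  #-}
```

Further rules for other type pairs would follow the same pattern, one rule per pair of types.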