[Haskell-cafe] nbody (my own attempt) and performance problems

Ryan Dickie goalieca at gmail.com
Wed Nov 28 15:01:21 EST 2007


On Nov 28, 2007 11:18 AM, Dan Weston <westondan at imageworks.com> wrote:

> Just out of curiosity...
>
> > --some getter functions
> > pVel !(_,vel,_) = vel
> > pPos !(pos,_,_) = pos
> > pMass !(!_,!_,!mass) = mass
>
> What does the !(...) buy you? I thought tuples were already strict by
> default in patterns (you'd need ~(...) to make them irrefutable), so
> isn't the above equivalent to:
>
> --some getter functions
> pVel  (_,vel,_) = vel
> pPos  (pos,_,_) = pos
> pMass (!_,!_,!mass) = mass
>

Yes, you are right. I did not need those extra ! marks in front of the tuples.
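
For the record, a minimal sketch of the difference, assuming GHC with the BangPatterns extension. The Vec and Planet aliases and the constStrict, constLazy and diverges helpers are made up for illustration and are not part of nbody3.hs:

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical aliases mirroring the tuple layout described in this thread.
type Vec    = (Double, Double, Double)
type Planet = (Vec, Vec, Double)

-- Plain tuple pattern: matching already forces the outer tuple constructor.
pVel :: Planet -> Vec
pVel (_, vel, _) = vel

-- Bang on the whole tuple pattern: behaves the same as pVel.
pVelBang :: Planet -> Vec
pVelBang !(_, vel, _) = vel

-- Ordinary pattern: the argument is forced even though nothing is bound.
constStrict :: Planet -> Int
constStrict (_, _, _) = 0

-- Lazy (irrefutable) pattern: the argument is never inspected.
constLazy :: Planet -> Int
constLazy ~(_, _, _) = 0

-- Helper: True if evaluating x to weak head normal form throws.
diverges :: a -> IO Bool
diverges x = do
  r <- try (evaluate x)
  return $ case r of
    Left e  -> const True (e :: SomeException)
    Right _ -> False
```

With these definitions, pVel undefined, pVelBang undefined and constStrict undefined all fail, while constLazy undefined returns 0. Only the ~ version differs.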


>
> And why in any case are the tuple components for pMass strict but for
> pVel and pPos non-strict? Is it that mass is always used but position
> and velocity are not?
>

Without all three components of the tuple in pMass being !'d, I find a 2x
slowdown. This includes trying pMass (_,_,!mass), pMass (!_,_,!mass), and all
other combinations.

Why that happens, I do not know. pMass is only used where its argument (the
planet tuple) was defined strict, as below. I would expect p1 to be fully
evaluated before pMass p1 is ever called.

offset_momentum (!p1,p2,p3,p4,p5) = ( pp1,p2,p3,p4,p5 ) where
        pp1 = ( pPos p1,ppvel,pMass p1 )
        ....
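
One possible factor (a guess, not something measured here): a bang on p1 only forces p1 to weak head normal form, i.e. the outer tuple constructor, so the components inside may still be unevaluated when pMass p1 runs. A small sketch of what each pattern forces, assuming GHC with BangPatterns. The names pMassLazy, pMassStrict, outerOnly and diverges, and the type aliases, are hypothetical:

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical aliases mirroring the tuple layout described in this thread.
type Vec    = (Double, Double, Double)
type Planet = (Vec, Vec, Double)

-- No bangs: only the outer tuple is matched; components stay untouched.
pMassLazy :: Planet -> Double
pMassLazy (_, _, mass) = mass

-- Bangs on every component: each one is forced to weak head normal form
-- (for pos and vel that means their own tuple constructor, not the Doubles).
pMassStrict :: Planet -> Double
pMassStrict (!_, !_, !mass) = mass

-- A bang on the whole argument, like !p1 above: forces the outer tuple only.
outerOnly :: Planet -> Int
outerOnly !_ = 0

-- Helper: True if evaluating x to weak head normal form throws.
diverges :: a -> IO Bool
diverges x = do
  r <- try (evaluate x)
  return $ case r of
    Left e  -> const True (e :: SomeException)
    Right _ -> False
```

Here outerOnly (undefined, undefined, undefined) returns 0, so a bang on the planet says nothing about its fields, whereas pMassStrict (undefined, undefined, 5) fails. Whether this accounts for the 2x difference is something only a profile would show.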

>
> Ryan Dickie wrote:
> > I sat down tonight and did tons of good learning (which was my goal).
> > Yes, the variable names in the unrolling are a little "ugly" but it helps
> > to read the C++ version for context. There are two for loops (advN is
> > each inner one unrolled). The other function names match the C++
> > version.  It was my goal to implement an unrolled version of that.
> >
> > Fortunately, my performance is excellent now. It is only 4x slower than
> > the C++ version and 2x slower than the Haskell one listed (which uses
> > pointer trickery). I am sure there could be more done but I am at my
> > limit of comprehension. But if I may guess, I would say that any speed
> > issues now are related to a lack of in place updating for variables and
> > structures.
> >
> > I'd also like to thank everyone for their help so far. I have attached
> > my latest version.
> >
> > --ryan
> >
> > On Nov 27, 2007 7:14 PM, Sterling Clover <s.clover at gmail.com> wrote:
> >
> >     The first step would be profiling -- i.e. compiling with
> >     -prof -auto-all to tag each function as a cost center, then running
> >     with +RTS -p to generate a cost profile. The problem here is you've
> >     got massive amounts of unrolling done already, so it's sort of hard
> >     to figure out what's doing what, and the names you've given the
> >     unrolled functions are... less than helpful. (first rule of
> >     optimization: optimize later.)  The use of tuples shouldn't be a
> >     problem per se in terms of performance, but it really hurts
> >     readability to lack clear type signatures and types. You'd probably
> >     be better off constructing a vector data type as does the current
> >     Haskell entry -- and by forcing it to be strict and unboxed (you
> >     have nearly no strictness annotations I note -- and recall that $!
> >     only evaluates its argument to weak head normal form, which means
> >     that you're just checking if the top-level constructor is _|_)
> >     you'll probably get better performance to boot. In any case,
> >     declaring type aliases for the various units you're using would
> >     also help readability quite a bit.
> >
> >     --S
> >
> >     On Nov 27, 2007, at 5:41 PM, Ryan Dickie wrote:
> >
> >      > I thought it would be a nice exercise (and a good learning
> >      > experience) to try and solve the nbody problem from the debian
> >      > language shootout. Unfortunately, my code sucks. There is a
> >      > massive space leak and performance is even worse. On the bright
> >      > side, my implementation is purely functional. My twist: I
> >      > manually unrolled a few loops from the C++ version.
> >      >
> >      > I believe that most of my performance problems stem from my abuse
> >      > of tuple. The bodies are passed as a tuple of planets, a planet
> >      > is a tuple of (position, velocity, mass) and the vectors position
> >      > and velocity are also tuples of type double. My lame
> >      > justification for that is to make it nice and handy to pass data
> >      > around.
> >      >
> >      > Any tips would be greatly appreciated.
> >      >
> >      > --ryan
> >      > <nbody3.hs>
> >      > _______________________________________________
> >      > Haskell-Cafe mailing list
> >      > Haskell-Cafe at haskell.org <mailto:Haskell-Cafe at haskell.org>
> >      > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> >
> >
>
>
>
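
As a rough illustration of Sterling's suggestion quoted above (a strict, unboxed vector type plus type aliases), here is one way it might look. This is only a sketch: Vec3, Planet, vAdd, vScale, move and whnfOnly are names invented here, and it is not the code of the current shootout entry.

```haskell
-- Strict 3-vector; UNPACK asks GHC to store the Doubles unboxed in the
-- constructor (normally honoured when compiling with optimisation).
data Vec3 = Vec3 {-# UNPACK #-} !Double
                 {-# UNPACK #-} !Double
                 {-# UNPACK #-} !Double
  deriving (Eq, Show)

-- Type aliases for the units involved, for readability.
type Position = Vec3
type Velocity = Vec3
type Mass     = Double

-- Strict fields: building a Planet forces each field to WHNF, and because
-- Vec3's own fields are strict, that evaluates the Doubles as well.
data Planet = Planet
  { pPos  :: !Position
  , pVel  :: !Velocity
  , pMass :: {-# UNPACK #-} !Mass
  } deriving (Eq, Show)

vAdd :: Vec3 -> Vec3 -> Vec3
vAdd (Vec3 a b c) (Vec3 x y z) = Vec3 (a + x) (b + y) (c + z)

vScale :: Double -> Vec3 -> Vec3
vScale k (Vec3 x y z) = Vec3 (k * x) (k * y) (k * z)

-- Advance one planet's position by a time step dt.
move :: Double -> Planet -> Planet
move dt p = p { pPos = pPos p `vAdd` vScale dt (pVel p) }

-- By contrast, $! on an ordinary tuple stops at the (,) constructor:
-- the undefined component is never evaluated, so this is just 5.
whnfOnly :: Int
whnfOnly = snd $! (undefined, 5)
```

With strict fields the evaluation happens when the value is constructed, so there is no need to sprinkle bangs over every pattern that takes a planet apart.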