[Haskell-beginners] Flying Dutchman sailing blind

Daniel Fischer daniel.is.fischer at web.de
Fri Oct 15 05:32:33 EDT 2010


On Friday 15 October 2010 09:45:46, Jeroen van Maanen wrote:
> On 2010-10-13 15:53, Daniel Fischer wrote:
> > On Wednesday 13 October 2010 14:52:58, Jeroen van Maanen wrote:
> >> So in fact the culprit turned out to be the function updateRationals
> >> in the module
> >>
> >>
> >> http://lexau.svn.sourceforge.net/viewvc/lexau/branches/totem/src/LExA
> >>u/D istribution/MDL.hs?view=markup
> >>
> >> It is still eating more time than the actual optimizer, so
> >> suggestions for improvement are still welcome.
> >
> > First,
> >
> > approximateRational :: Double -> Rational
> > [...]
> >
> > is exactly toRational, so there's no need for that function. Also,
> > there's a much faster implementation of toRational (for Double and
> > Float) underway, I think it will be in the first GHC 7 release due in
> > a couple of weeks. Anyway, using toRational will let you profit from
> > that with no additional effort when you get a compiler with that
> > patch.
>
> This was one of the first tricky bits that I converted from Java (in
> fact, it was written in Clojure, a Lisp variant that runs on the Java
> virtual machine) to Haskell. I was fighting with implicit type
> inference, and I didn't know then how to fix the type of the value
> returned by the random number generator. Thanks for pointing it out.
>
> > Second, [Third ... Now updateRationals]
>
> Thanks for taking me through this step-by-step. It helps me tremendously
> in developing an intuition about what is going on in my program. It
> feels a bit like when I started writing XSLT style-sheets. ;-)
>
> Looking at the graphical presentation of heap usage[1] I guess that this
> solved most of my heap problems.

According to that profile, your live heap starts out at between 3500 and 
4000 kilobytes and decreases from then on. It may be less than optimal, but 
a heap problem looks quite different.

You can get more info by doing profiling runs with -hc, -hb and so on.

> However the large amount allocated to
> TSO (I guess that is Thread State Object) suggests to me that I still
> have an issue with stack usage here.

You seem to fork/spark a lot of threads, so a lot of TSOs are generated.

>
> Another thing that I noticed when I went over my algorithm again, is
> that the list of weight/bound pairs is sorted according to the bound
> component on each recursion step. This also happens when the produced
> data structure is used later on in the program: I sort all pairs with a
> weight smaller than a certain threshold by bound. Is there a some
> prepackaged data structure around that would efficiently support this
> usage pattern?
>
> Oh, and I Googled for "Haskell BLACKHOLE", but didn't find an
> explanation of what it is. Is the amount of black holes on the heap of
> my program an indication of some inefficiency that can be removed?

I think section 8 of

http://www.haskell.org/~simonmar/papers/multicore-ghc.pdf

will be useful reading.
As a guess, it may help to set up some data single-threadedly before going 
parallel.

>
> Cheers,  Jeroen
>
> [1] http://lexau.org/pub/lexau-heap.pdf



More information about the Beginners mailing list