[Haskell-cafe] Re: A guess on stack-overflows - thunksbuild-upandtail recursion

Sun Mar 22 07:06:49 EDT 2009

>I just found out about GHood through this thread, and since it  impressed me very much to see 
>something so cool, I feel bad making  this comment... but I am always disturbed by the flickering 
>effect  produced by java applets in my browser (FF 3.0) while scrolling.  From  an implementation 
>standpoint this is obviously a nitpick, but from a  designer standpoint it nearly single-handedly 
>kills any prospect of my  putting it up on a page.

No problem, but thanks for qualifying the criticism!-) I notice little of
that in Opera, but generally, yes, Java is a bit heavyweight for the job.
But it got the job done when alternatives were few and still does until
someone provides a lightweight viewer. Here are a couple more
examples, from this thread and from the recent "Safe Lazy IO"
announcement,  for those who have just found out about GHood
and are following this thread:

1. chaining readFile

In http://www.haskell.org/pipermail/haskell-cafe/2009-March/058205.html ,
the motivating example of countLines is used. GHood can't observe when
file handles are closed (that would have needed reliable Haskell finalizers,
the lack of which led to me abandoning further development of GHood),
but if we optimistically assume that file handles are closed when the file
contents have been observed in full, to the closing "[]", we can observe
the other relevant aspects of the discussion.

import Debug.Observe
import System.Environment

readFileO f = observe ("readFile "++f) readFile f

countLines1 = runO $ print . observe "length" length . lines . concat =<< mapM readFileO =<< getArgs
countLines2 = runO $ print . observe "sum" sum =<< mapM (fmap (length . lines) . readFileO) =<< 
getArgs
countLines3 = runO $ print . observe "sum" sum =<< mapM (((return$!) . length . lines =<<) . 
readFileO) =<< getArgs

The animations for these instrumented versions should be fairly
self-explanatory if you stare at them long enough:-) Compare, eg,

ghc  -e ':set args a b' -e countLines2 RF
ghc  -e ':set args a b' -e countLines3 RF

with a couple of _small_ input files (since we do not depend on running
out of handles to observe the effects we're interested in, two files are
sufficient, and large input files would only obscure the animation):

$ cat a b
a1
a2
a3
b1
b2
b3

2. tail recursion and strictness, if an external data structure is in the way

Taking Data.IntMap as an example, because it tends to come up often,
the first problem we run into is defining an instance of Observable. That
is usually straightforward boilerplate code, but IntMap's representation
is not exported, so we have to make do with an abstract observer:

import qualified Data.IntMap as IM
import Debug.Observe
import Data.List

instance Observable a => Observable (IM.IntMap a)
  where observer im = send "IntMap" (return IM.fromList << IM.toList im)

Now, the problem at hand is that while IntMap is strict in its structure
and keys, it isn't strict in its values, so if we just wrap the usual strict
foldl' around Data.IntMap.insert, we do not get the intended effect
(assuming here that the intention was to evaluate all parts of the IntMaps).

The standard workaround for IntMaps is to hook the evaluation of the
values to some part of the data structure that is certain to be evaluated,
in this case the keys. Then the construction of the IntMap forces the
keys, which forces the values.

This can be observed in the animations for these two variants:

f k = 2*k

version1 = runO $ print $ observe "foldl'" foldl' (\m (k,v)->IM.insert k v m)
                    IM.empty $ [(i,observe "f" f i)|i<-[1,2]]

version2 = runO $ print $ observe "foldl'" foldl' (\m (k,v)->(IM.insert k $! v) m)
                    IM.empty $ [(i,observe "f" f i)|i<-[1,2]]

Observations of higher-order functions take some getting used to, so
have a look at event 97/130 for "version1" to get your bearings before
diving into the details of the animation (you might want to enlarge the
window or scale down the visualization).

The observation of "foldl'" reveals a function that takes a function as
first parameter, an empty IntMap as second parameter, and a two-element
list as a third parameter, producing a two-element IntMap as its result.
The first parameter of "foldl'" is a function (the lambda abstraction with
IM.insert in the code) that has been called two times at this stage (two
children of one FUN node), once to map an empty IntMap and a pair
to a one-element IntMap, and a second time to map the one-element
IntMap and a pair to a two-element IntMap.

At this point, the IntMaps and their keys have been evaluated, but none
of their values have. In fact, the function computing the values, "f", has
not been called once. That is all going to happen now, as printing the
result forces the values. One can almost see the chains of demand in
the animation, as thunks (red) get inspected (yellow), leading to further
inspection elsewhere, leading to results (black) elsewhere, leading to
results replacing the original thunks. It is this feel for the dynamics of
evaluation that gives GHood its edge over Hood, which does all of
the work of generating the observation log.

Now, compare with the animation for "version2". Here "f" gets called
early on (event 26/130), as the IntMaps cannot be constructed without
evaluating their keys, which we have made dependent on evaluating
their values. Again, the animation reveals the chain of dependencies.

Stepping further (event 119/130), we notice that we seem to have
been only partially successful. The values are indeed evaluated before
the keys can be used, but there are still thunks in those IntMaps! That
is because the results of our strictified "insert"s have not been observed
yet, and GHood represents not-yet-observed as "thunk". From this
point on, printing of the result will observe and force the whole of
the IntMap, as before, but we see that this will not involve any further
calls to the function "f" that computes the values, just observations
of what has already been computed.

Hope these new examples give further ideas about what GHood
can and cannot be used for, and how to use it, including "Observable"
instances for abstract types for which we lack the source code
(also note that, as with all visualizations, careful setup is needed
to reveal the points of interest - other ways of instrumenting the
code or providing input files would have been possible, but less
helpful). As mentioned before (and below), the paper goes into
more detail on GHood itself.

Claus

> With that said, I think the canvas+js idea is a wonderful alternative  to proprietary Flash.

I've also often wanted browser-based, interactive, profiling views
instead of the somewhat awkward PostScript generation of static
views.

> Regards,
> Duane Johnson
>
> On Mar 20, 2009, at 5:36 PM, Claus Reinke wrote:
>
>>> It would be great to have a video of this in action up on youtube.
>>> You can simply 'recordmydesktop' on linux (and likely elsewhere),  then
>>> upload the result.
>>
>> I'm curious: how would a non-interactive animation running in Flash
>> in a browser be better than an interactive animation running in Java
>> in a browser?-) When I wrote GHood (many years ago), I explicitly
>> looked into the applet option, in the hope that people would use it
>> to document and discuss observation logs of their favourite Haskell
>> strictness issues, with animations available on their web pages, right
>> next to the discussions.
>> That hasn't happened yet (the only users I was aware of were the
>> DrHylo/Pointless Haskell project), but I just checked, the old .jar  file,
>> the source of which hasn't been perused for a long time, still  worked in applet mode (in Opera, 
>> a browser I didn't know about in  2001,
>> using a Java Runtime several versions removed from that time - try
>> that in Haskell.. ;-), straight from that old project page (which  also explains how to set such 
>> things up), so anyone could add  animations of their favourite examples on their web-pages. But 
>> don't  let that keep you or anyone else from addressing the youtube  audience (one could add 
>> audio explanations, I guess).
>>
>> Claus
>>
>> PS. Perhaps these days, someone should rewrite the log viewer
>>   in Canvas+JavaScript as a more lightweight and modern platform.
>>
>>> It also helps the general adoption cause, having Haskell more visible
>>> and accessible.
>>> claus.reinke:
>>>>>> The problem occurs when the result value is needed and thus  the   thunks need to be reduced, 
>>>>>> starting with the outermost,  which can't   be reduced without reducing the next one .... etc 
>>>>>> and it's these   reduction steps that are pushed on the stack  until its size cause a 
>>>>>> stack-overflow.
>>>>>
>>>>> Yes, that's exactly right, and something that's not often pointed  out.
>>>>
>>>> Btw, this is kind of relative strictness (when is one part of my  program
>>>> needed to answer demands on another part) is the kind of example
>>>> for which old GHood can be helpful (once you get used to the  display).
>>>>
>>>> If you have Java on your machines, try installing GHood [1] (on  hackage thanks to Hugo 
>>>> Pacheco), then things like
>>>>
>>>> ghc -e ':m +Debug.Observe' -e 'printO $ observe "foldr" foldr (+)  0 [1..4] '
>>>> ghc -e ':m +Debug.Observe' -e "printO $ observe \"foldl'\"  foldl' (+) 0 [1..4] "
>>>> ghc -e ':m +Debug.Observe' -e 'printO $ observe "foldl" foldl (+)  0 [1..4] '
>>>>
>>>> This was also among the examples on the GHood home page [2], so  you could try the applet 
>>>> version instead, and in section 4.2 of  the paper [3] (as a "well known strictness problem";-). 
>>>> Page and  paper
>>>> mention a few other similar examples and discuss some differences
>>>> between static (which parts are needed at all) and dynamic  strictness
>>>> (which parts are needed when, relative to other demands).
>>>>
>>>> Claus
>>>>
>>>> [1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/GHood
>>>> [2] http://www.cs.kent.ac.uk/~cr3/toolbox/haskell/GHood
>>>> [3] http://www.cs.kent.ac.uk/~cr3/publications/GHood.html
>>>>
>>>> _______________________________________________
>>>> Haskell-Cafe mailing list
>>>> Haskell-Cafe at haskell.org
>>>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>> _______________________________________________
>>> Haskell-Cafe mailing list
>>> Haskell-Cafe at haskell.org
>>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe