my experience with ghci debugger extensions
phercek at gmail.com
Thu Feb 5 04:05:39 EST 2009
Hi users of ghci debugger,
This post is going to be a bit longer. Here are few cookies to motivate
you to go on:
* you will probably like it more than printf debugging for a lot of cases
* it can provide a way of implementing Claus Reinke's idea of
breakpoints with a list of identifiers which should be available at
breakpoint location and doing it without recompilation; here is the link
to Claus' message:
* it gives some idea to ghc team about importance of ghci debugger
related tickets (and whether to implement them just the way they were
A note to ghc developers: Stepping program in ghci debugger sometimes
purges top level bindings and sometimes not. Not sure this is a bug or
feature (whether there is some logic in it). I do not have a simple
example when it purges the bindings. But I did not really look for one.
I would probably post later, but some ghci bugs and missing features are
badly limiting my progress. There is not much more I can think of to
investigate. Maybe somebody will have better ideas how to solve the
obstacles I'm hitting. I'm also posting with hope that people will find
this interesting and ghc team will fix some of the critical bugs and
adds the most critical features, especially if somebody will not have
better debugging tips.
You can get my extensions here:
The extensions are not in a single .ghci file now. The new .ghci file
needs to install a library. The reason is that without the library it is
not manageable any more.
And here are arch linux packaging instructions for my custom ghc (if you
are an arch linux user just download and run makepkg :) ):
The custom ghc is just the same one as ghc 6.10.1 with two more patches:
t2740.patch and loglocal.patch. The first one fixes ticket 2740 and you
will get it in 6.10.2. The second one adds :loglocal command to ghci.
You can extract the patches from the tar file.
If you already read ghci scripting tutorial from Claus Reinke then you
will know how to customize ghciext (that is if you would feel like doing
so). The tutorial is here:
Here is the list of custom commands in ghciext package:
:defs -- list user-defined commands
:. <file> -- source commands from <file>
:redir <var> <cmd>... -- execute <cmd> redirecting stdout to <var>
:grep <pat> <cmd>... -- filter lines matching <pat> from <cmd> output
:* <count> <cmd>... -- run <cmd> <count> times
:x <cmd>... -- run <cmd> with stdout suppressed
:bp <bpArgs> -- put breakpoint at location <bpArgs> (adds
:inject <cc> <c> <sc> <b> -- at location <b> execute <c> if <cc>, and
stop if <sc>
:monitor ["<c>"] <vs> <b> -- show comma separated variables at location
<b> if <c>
:watch <cond> <bpArgs> -- break at location <bpArgs> when <cond> is True
:count (_|<N>) [<bpArgs>] -- count/query/stop execution at location <bpArgs>
:find <var> <cmd>... -- step with <cmd> until <var> is found
:findex <str> <cmd>... -- step with <cmd> until <str> is found
:next [lazy] -- step over; lazy version forces only the top
:xout -- redirect ghci stdout to /dev/null
:out -- redirect ghci stdout back to console
:defs, :., :redir, :grep, are the same as the commands in Claus'
tutorial. The only differences I recall now are:
* grep pattern can be in quotation marks (easier search for spaces)
* grep merges :browse output more nicely
* redir can accept :step, :steplocal etc; i.e. also the commands which
sometimes remove top level bindings
* the commands do not pollute top level bindings so much
The rest will describe my custom commands and now they relate to tickets
in ghci track. If you want to check the tickets mentioned here then the
most easy way is probably selecting them from this list:
The initial big problem was how to limit the amount of information ghci
debugger spits at you when breakpoints with custom scripts are used.
This is also mentioned in forth point of the "unknown" section of ticket
/We can disable a breakpoint with ":set stop N :continue", but this
still prints out the breakpoint info when we stop. Should we print the
info only if there were no commands?/
So I say: yes do it! Just disable any ghci output related to a
breakpoint when the breakpoint has a custom code attached. We can
disable the output ourselves, but then we disable all the output
(including the output of the debugged program). People who are debugging
console applications are busted (especially if the applications are
interactive). This is not an issue for me since my application is not
using console almost at all. I'm solving the problem by prefixing
commands like /:main/ and /:continue /with my command /:x/. This makes
output by default disabled and ghciExt enables it just before a
breakpoint code is run. If the breakpoint continues execution it
disables it again. If the debugged function finishes the output is
enabled by /:x/ itself. A small problem happens when you forget to use
/:x/ e.g. with your /:main/. Then you do not get a prompt when the
program finishes. It's time to notice it and use /:out/ to switch it on.
This is the only minor disadvantage for a gui application debugging.
/:xout /in not that useful (it is mostly used internally by /:x/). It
could be a hidden command (i.e. not registered in defs).
Now when we have ghci "muted" we can get to real goodies: /:monitor/,
/:watch/, and /:count/.
If you like printf debugging, then /:monitor/ can help you. It can
monitor only free variables in the selected expression (on which a
breakpoint is hit) but that was always enough for me during debug
sessions I had. Write a script file (lets say it is named *ghciInit*
(I'll also call it this way later on in this text)) which is like this
/:monitor var1,var2 ModuleName 23
:monitor var3 ModuleName 40
:x :main mainArgument/
Then open ghci with your program and run *:. ghciInit*
And you will get nice log like this:
/(0): var1 = <value01>
var2 = <value02>
(1): var3 = <value03>
(0): var1 = <value11>
var2 = <value12>
(1): var3 = <value13>
Moreover /:monitor/ allows condition in quotation marks to be specified
as the first argument. If you do it then the variable values will be
printed only when the condition is True. Of course the condition can
contain only free variables in the selected expression. Mostly it is not
/:watch/ is a conditional breakpoint which stops when the specified
condition is True.
/:count/ has 3 forms:
/:count _ ModuleName 23/
This never breaks just counts number of times the execution reached
position ModuleName 23
/:count 5 ModuleNmae 23/
This breaks when we reach position ModuleName 23 fifth time
This tells how many times breakpoint number 0 was hit (it can report
the number of hits for any breakpoint created with /:bp/, /:inject/,
/:monitor/, /:watch/, and /:count/).
The first form of /:count/ is interesting when you want to stop earlier
than something bad happens so that you can see why the bad think
happened. Put the first form of /:count/ at the start of the function
with the bug and then put a break in the function which is hit when the
bad think happens maybe with /:watch/. When you stop at the bug place,
check hit count at the start of the function. Add the hit count in your
/:count/ breakpoint in your *ghciInt* file restart and you can use
/:steplocal/ or /:loglocal/ to find out what went wrong.
/:loglocal/ is implemented directly in ghc source code. It does exactly
the same as /:steplocal/ but makes sure that the trace history does not
contain anything outside of the function we step with /:steplocal/. The
problem is that /:steplocal/ works like the code was traced while it is
executing. Mostly the result is that your trace history is loaded with
crap outside of the scope you are interested in. I'll return to
/:loglocal/ later again.
/:inject/ is there when you need something special (/:monitor/,
/:watch/, and /:count/ are implemented with something very like
/:inject/). E.g. when you want to do monitoring of a value but do not
want the associated breakpoint number printed.
/:find/ and /:findex/ are there primary to search trace history.
/:find var1 :back/
will find the variable var1 in your trace history by back stepping it
/:find var1 :step/
will single step forward till variable var1 is in the list of free
/:findex BL/Logic.hs:23 :loglocal/
will fill in your trace history will all the local breakpoints till
location BL/Logic.hs:23 is hit. Having the trace history filled in with
the right stuff is useful for checking out why you got bad results later.
Now lets return to Claus' idea of breakpoints with a list of identifiers
which should be available at the breakpoint location. You can make sure
the identifiers are available with the first form of /:count/. It never
stops but it puts records to the trace history. So the trace history
will contain free variables at the locations where you put /:count/.
This will be even more useful when automatic search of trace history is
built in (see ticket #2737). So a weaker form of Claus' idea can be
implemented by carefully selecting what should bet to the trace history.
Why a weaker form only? Well, in some cases, the variable instances in
trace log may not be the expected ones (they may be from a different
lexical scope). Experience with my code indicates this should be rare.
/:next/ is an idea how to implement a kind of step over. That is if by
step over you mean something else than steplocal. The non-lazy form of
/:next/ forces _result and does a /:step/. The lazy form forces only the
top level constructor of _result before the step. Hey, I even had a case
when it worked just like I expected. But typically it does not work
because of bug #1531. _result is not correctly bound to the result of
selected expression in most of the practical cases. This bug is also
critical for all the forms of conditional breakpoints. It would be cool
if we could specify the condition based on _result or some part of it.
The implementation of ghciExt conditional breakpoints would need to be
extended to support conditions on _result (in particular the breakpoint
would need to be disabled during the condition execution) but that is
easy to do. Even more worrying thing about bug #1531 is that it has the
milestone set to _|_.
It is easy to add /:enable/ and /:disable/ to support enabling and
disabling breakpoints. I just did not need it yet. Here is how a GhciExt
breakpoint looks like:
*Main> :show breaks
 Main a.hs:4:2-8 ":cmd return$GhciExt.getStopCode 0 (True) "putStr
\"(0): \"\n:force x" "False""
Just replace getStopCode with getDisabledStopCode and you have it
disabled. Return back to enable. Yeah, and implement getDisabledStopCode
which will just continue.
I added :/loglocal/ mostly to simulate how /:tracelocal/ in ticket #2737
would help. I was also trying how full tracing is helping. In both cases
the answer is: full tracing almost never helps. /:tracelocal/ from
ticket #2737 as proposed originally would rarely help. The problem is
that trace log gets overwhelmed with crap when we cannot control what
can be saved in it and what cannot be saved. My idea is that user should
be able to specify what can go in it and also what should not go in it.
Here is an alternative solution to the ones I proposed in tickets #2737
and #2946. I think this one would be best. The command to control the
tracing should look like:
-- should everything be traced?
:set trace (True|False)
-- scopes which should be traced (or should not be traced when ! is present)
:set trace ( (!)? scopeid )*
-- add/remove individual scopeids to/from the trace specification
:set trace (+|-) (!)? scopeid
where scopeid = ( conid . )* ( varid . )* varid
Notice how scopeid looks. It can have a sequence of varids at the end.
The reason is so that user can leave out a scope of a function which is
defined in a where clause. The scope specification is similar to the
proposal in ticket #3000. E.g. for this code:
fn s = 'a' : add s
where add = (++"z")
it could look like
:set trace Main.fn !Main.fn.add
meaning trace whole scope of *fn* but not the stuff in the scope of *add.
*Order should not be important, requests for not tracing should have
precedence before requests to trace.
The scopes which we want typically exclude are the ones which contain
loops. The loop content often fills in the trace log forcing the
interesting stuff out of it. It is better to investigate functions with
loops separately in nested context.
Notice that there is a bit difference between this proposal of
controlling trace content and the one in ticket #2737. #2737 assumes
usage of breakpoint arguments to specify a scope. The breakpoint
arguments give an ability to define scopes at finer level but there is
no option to define exclude scopes which I find important now.
The summary is: Trace log is as useful as much you can control what can
get in it. The :trace command looks to me like an error. It is better to
control it by allowing/disallowing scopes.
I also changed my opinion a bit about ticket #2945. :mergetrace would be
better than global trace history. Being able to investigate something
separately in a nested context is useful.
If I should order the ghci debugger related tickets then the order would
be like (more important first):
#1531 (_result can get bound to the wrong value in a breakpoint)
#2737 and #2946 (add :tracelocal to ghci debugger... and add command
#3000 (:break command should recognize also nonexported top level
#2803 (bring full top level of a module in scope when a breakpoint is
hit in the ...)
#1377 (the task: "We should print breakpoint related info only if
breakpoint has no commands set") but people debugging interactive
console applications would like to have this one the very top; *IIRC*
this may be easy to do, looks like all the printing is done in one
function (something like afterCmd???); also #2950 looked like trivial to
do (like 15 mins without the compile time???)
And the last thing: my first time experiences hacking in the /:loglocal/
into ghc. I cannot tell much, I spend with it only one long Sunday
afternoon, but here are my two points:
* I needed to extend ghc interface. The type of function GHC.resume
resume :: GhcMonad m => SingleStep -> m RunResult
resume :: GhcMonad m => (SrcSpan->Bool) -> SingleStep -> m RunResult
... plus the corresponding implementation change. The added argument
is a filtering function to limit source spans which can recorded in the
* It would be cool if ghci has its own dir in the souce tree where only
the ghci source files are. It would encourage people to hack it more
since it would be easier to maintain private patches and merging
upstream. It would be also easier to make sure one modifies only ghci
source code so that it works with other ghc releases.
Hopefully this helps somebody,
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Glasgow-haskell-users