[Haskell-cafe] getting more out of ghci [longish]

Tue Sep 25 11:52:23 EDT 2007

== intro

no, i'm not talking about using a nice frontend to ghci, with added
functionality, although the haskell modes for emacs and vim, and
other such gui/ide/editor efforts, are well worth looking into!-)

also, i'm not going to talk about the eagerly anticipated ghci debugger.

what i am going to talk about are some of the little things one can do
to improve the ghci user experience, including features asked for in
ghc's ticket tracker, features known and cherished from hugs, and
features that are not even available in hugs.

the keys to adding such useful features are :def, which allows us to
define new ghci commands, and :redir, which allows us to capture the
output of ghci commands for further processing. and to save you the
search: :def has been with us for a while (it is in ghci 6.6.1), and
:redir is not among the pre-defined ghci commands, not even in ghc head.

the reason i started looking into adding commands to ghci were (a) the
added functionality in ghci 6.7, which will (b) soon require me to be
able to capture ghci command output, for proper interfacing with my
haskell mode plugins for vim. originally, i thought that output capture
would need to be hacked into the ghci sources (i even drafted a
patch:-). it then became clear that we can implement :redir on top of
the existing commands, even in ghci 6.6.1!

all the commands we're going to introduce are defined in the file
dot-squashed.ghci, which can be found here:

http://www.cs.kent.ac.uk/~cr3/toolbox/haskell/dot-squashed.ghci

(there is a patch pending for ghc head that will let us spread 
definitions in ghci over multiple lines, but for now, we have
to squash them into single lines..; i will take the liberty to violate 
this restriction in the explanations below), and work with ghci 6.6.1,
although there are more commands and information to play with in 
ghc 6.7 or later.

== command overview

so what kind of commands are we going to define? here's an overview:

  Prelude> :def s readFile
  Prelude> :s dot-squashed.ghci
  Prelude> :defs list

  :. <file>               -- source commands from <file>
  :pwd                    -- print working directory
  :ls [<path>]            -- list directory ("." by default)
  :redir <var> <cmd>      -- execute <cmd>, redirecting stdout to <var>
  :redirErr <var> <cmd>   -- execute <cmd>, redirecting stderr to <var>
  :grep <pat> <cmd>       -- filter lines matching <pat> from the output of <cmd>
  :find <id>              -- call editor (:set editor) on definition of <id>
  :b [<mod>]              -- :browse <mod> (default: first loaded module)
  :le [<mod>]             -- try to :load <mod> (default to :reload); edit first error, if any
  :defs {list,undef}      -- list or undefine user-defined commands

== let's go - simple things first

:def is commonly used to abbreviate frequently entered commands:

  Prelude> :grep :def :?

     :def <cmd> <expr>           define a command :<cmd>

where the <expr> is of type 'String -> IO String', so it takes a command
parameter (actually, the rest of the command line after :<cmd>), does
some IO, and returns a String, which is going to be interpreted as ghci
input. that's why our ':def s readFile' defined a :s command to source
ghci input from a file, which we then used to source our new command
definitions (alternatively, we could have put those into our .ghci file).

ghci 6.7 and later also have a :cmd command, which takes an expression
of type 'IO String' - that expression is immediately executed, and its
result String is interpreted as ghci input. we'll define our own version
for ghci 6.6.1:

  -- 6.6.1 doesn't have this, omit this def for later ghci's
  :def cmd \l->return $ unlines [":def cmdTmp \\_->"++l,":cmdTmp",":undef cmdTmp"]

we simply use the expression parameter to define a temporary command,
which we execute immediately and undefine afterwards.

== being helpful and self-documenting

ghci only keeps an internal record of commands added with :def, so we
can neither get a list of them, nor will they appear in :? output. we
can be more helpful by wrapping each command with a line of help:

  -- make commands helpful
  let { cmdHelp cmd msg "--help" = return $ "putStrLn "++show msg
      ; cmdHelp cmd msg other    = cmd other }

then we can define a few standard utilities, with help texts:

  :def . cmdHelp readFile ":. <file>\t\t-- source commands from <file>"

  let pwd _ = return "System.Directory.getCurrentDirectory >>= putStrLn"

  :def pwd cmdHelp pwd ":pwd\t\t\t-- print working directory"

  let ls p  = return $ "mapM_ putStrLn =<< 
                          System.Directory.getDirectoryContents "++show path 
              where {path = if (null p) then "." else p}

  :def ls cmdHelp ls ":ls [<path>]\t\t-- list directory (\".\" by default)"

sourcing commands saves key-strokes on a frequently used operation,
while :pwd and :ls save me from having to recall whether i'm using ghci
in windows or unix, by using haskell functions;-) we'll see later how we
register our commands with :defs, but each of these commands responds to
calls for --help:

  Prelude> :. --help
  :. <file>               -- source commands from <file>

so, once registered, we can let :defs give us --help for all of them:

  Prelude> :grep :\.|:pwd|:ls :defs list

  :. <file>               -- source commands from <file>
  :pwd                    -- print working directory
  :ls [<path>]            -- list directory ("." by default)

== a hammer for many nails: capturing ghci command output

now for the big tool: :def is fine, and there are several ghci commands
that give useful information to the user, such as :?, :show modules,
:browse, etc. if we could only get hold of that information, we could
define much more interesting new ghci commands.

the plan is to redirect stdout to a temporary file, execute one of those
helpful ghci commands, restore stdout to the terminal, and read the
contents of the temporary file into a variable (then clean away the
temporary file).

unfortunately, there's no portable redirection functionality in the
standard libs, but in ghci, we're ghc-dependent anyway, and ghc provides
us with GHC.Handle. now, take a deep breath, 'cause here we go:

  let redir varcmd = 
       case break Data.Char.isSpace varcmd of 
       { (var,_:cmd) -> return $ unlines 
          [":set -fno-print-bind-result" 
          ,"tmp <- System.Directory.getTemporaryDirectory"
          ,"(f,h) <- System.IO.openTempFile tmp \"ghci\""
          ,"sto <- GHC.Handle.hDuplicate System.IO.stdout"
          ,"GHC.Handle.hDuplicateTo h System.IO.stdout"
          ,"System.IO.hClose h"
          ,cmd
          ,"GHC.Handle.hDuplicateTo sto System.IO.stdout"
          ,"let readFileNow f = readFile f >>= \\t->length t `seq` return t"
          ,var++" <- readFileNow f"
          ,"System.Directory.removeFile f"]
        ; _ -> return "putStrLn \"usage: :redir <var> <cmd>\"" }

  :def redir cmdHelp redir 
    ":redir <var> <cmd>\t-- execute <cmd>, redirecting stdout to <var>"

okay, that was a handful, and it doesn't look easier on a single line.

our :redir command takes two parameters: the name of a variable to bind
the output to, and the name of a command to capture the output of. so we
split the command line into var and cmd, and we complain with usage info
if that fails (unchecked failure in ghci command definitions is
generally a bad idea). the rest is fairly straightforward, actually, but
for the tedious inconvenience of keeping the different levels of
interpretation and scopes in mind: we're using the current scope to
construct strings that represent commands which will be executed in the
scope in which :redir will be called. also, we need to fully qualify our
variables, to be sure we can get the right ones, no matter what module
is loaded when the command is executed (strictly speaking, we should
qualify the remaining functions as well, to avoid pathological cases
like "import Prelude()" -- please keep that in mind if you do redefine
or hide any of those standard functions).

we don't want output from our auxiliary variable bindings, so we :set
-fno-print-bind-result; then we get a temporary file f, with handle h,
make a copy of stdout, then redirect stdout to h; we insert the cmd we
want to run (note the lack of quotes); afterwards, we restore stdout,
read the temporary file and bind its contents to var (note again the
quoting), before we clear away the temporary file.

and what do we get for all that trouble?

  Prelude> :redir --help
  :redir <var> <cmd>      -- execute <cmd>, redirecting stdout to <var>

  Prelude> :redir x :type id

  Prelude> x
  "id :: a -> a\n"

  Prelude> putStrLn x
  id :: a -> a

not much, yet, but that is a very useful hammer!-) and if :redir is our
hammer, the question becomes: what are our nails?

== filtering command output

we've already seen several uses of :grep, so let's deal with that next:

  let { merge []                    = [] 
      ; merge (l:c:ls) | i c > i l  = merge ((l++c):ls)
        where {i l = length (takeWhile Data.Char.isSpace l)}
      ; merge (l:ls)                = l:merge ls
      ; grep patcmd = 
        case break Data.Char.isSpace patcmd of 
        { (pat,_:cmd) -> return $ unlines 
          [":redir out "++cmd
          ,"let ls = "++if ":browse" `Data.List.isPrefixOf` cmd 
                        then "merge (lines out)" else "lines out"
          ,"let match pat = Data.Maybe.isJust . 
                            Text.Regex.matchRegex (Text.Regex.mkRegex pat)"
          ,"putStrLn $ unlines $ (\"\":) $ filter (match "++show pat++") $ ls"]
        ; _ -> return "putStrLn \"usage: :grep <pat> <cmd>\"" } }

  :def grep cmdHelp grep 
    ":grep <pat> <cmd>\t-- filter lines matching <pat> from the output of <cmd>"
  -- (also merges pretty-printed lines if <cmd> is :browse)

another handful, but not all that much if we focus on the grep function:
again, we split the commandline into pat and cmd (note that this
simple-minded approach doesn't permit spaces in pat); we then run the
cmd, capturing its output in the variable out, and apply a simple filter
to the lines in out, based on matching the regular expression pattern
pat; that's it (oh, and if cmd happens to be :browse, we undo the
pretty-printer's attempt to spread information over multiple lines, 
which would interfere with our line-based filtering).

now, this is more obviously useful, isn't it?-) we can :grep for
module-related commands in :?:

  Prelude> :grep mod :?

     :add <filename> ...         add module(s) to the current target set
     :browse [*]<module>         display the names defined by <module>
     :edit                       edit last module
     :load <filename> ...        load module(s) and their dependents
     :module [+/-] [*]<mod> ...  set the context for expression evaluation
     :reload                     reload the current module set
     :show modules               show the currently loaded modules

or find out what folds there are in the Prelude (this is similar to
hugs' :names command, btw):

  Prelude> :grep fold :browse Prelude

  foldr :: (a -> b -> b) -> b -> [a] -> b
  foldl1 :: (a -> a -> a) -> [a] -> a
  foldr1 :: (a -> a -> a) -> [a] -> a
  foldl :: (a -> b -> a) -> a -> [b] -> a

or what monadic commands in the Prelude operate on lists (this is no
replacement for hoogle, but useful for finding simple matches):

  Prelude> :grep Monad.*\[.*\] :browse Prelude

  sequence_ :: (Monad m) => [m a] -> m ()
  sequence :: (Monad m) => [m a] -> m [a]
  mapM_ :: (Monad m) => (a -> m b) -> [a] -> m ()
  mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]

== for our next trick: more hugs commands for ghci

=== :find

we've already included the :names functionality, but my favourite hugs
command must be :find <name>, which edits the module containing the
definition of <name>. this is even more useful as hugs comes with
sources for Prelude and libs, so you can simply say ':find span' and 
see span's standard definition. but, let's at least define a ghci :find
for modules for which we do have the sources!

the plan is to capture the output of :info <name>, grep for those
helpful comments "-- Defined at <file>:<line>:<col>:", which tell us
where to look, then call :edit with that information. for that to work,
we need an editor that can open a file at a specified line:

  -- if your editor doesn't understand :e +line file 
  -- (jump to line in file), you'll need to change 
  -- functions find and loadEditErr below
  :set editor gvim

now, for the find functionality:

  let find id = return $ unlines 
      [":redir out :info "++id
      ,"let ls = filter (Data.List.isInfixOf \"-- Defined\") $ lines out"
      ,"let match pat = Text.Regex.matchRegex (Text.Regex.mkRegex pat)"
      ,"let m = match \"-- Defined at ([^<:]*):([^:]*):\" $ head ls"
      ,":cmd return $ case (ls,m) of 
                      { (_:_,Just [mod,line]) -> (\":e +\"++line++\" \"++mod)
                      ; _ -> \"\" }"]

  :def find cmdHelp find 
    ":find <id>\t\t-- call editor (:set editor) on definition of <id>"

that's fairly easy by now, isn't it?-) we capture the output of :info,
grab the definition location, if any, and call :e. the latter is a bit
tricky because we need to compose the :e command with the information we
have obtained from :info. we achieve this by an extra level of
interpretation, passing our constructed ':e +line file' command to :cmd.

note again that :find will work only where we have the sources. for
instance, we don't have the sources for Prelude.span, so :find span
wouldn't work - :info span doesn't list a source file, only a module:

  Prelude> :info span
  span :: (a -> Bool) -> [a] -> ([a], [a])        -- Defined in GHC.List

but load a module of your own, then try ':find main' or something!-)

=== #1468: :browse should browse currently loaded module

another thing that hugs does is default to browsing the current module,
if :browse is called with no explicit module parameter. this has been
asked for in a ghci ticket.

by now, you're probably ahead of me, looking for ghci commands to grab
the current module from?-) you're looking for :show modules, and the 
implementation of :b is indeed straightforward:

  let { b browse "" = return $ unlines 
        [":redir out :show modules"
        ,":cmd case lines out of 
               { (l:_) -> return ("++show browse++"++head (words l))
               ; _ -> return \"\" }"]
      ; b browse m  = return (browse++m) }

  :def b  cmdHelp (b ":browse ") 
    ":b [<mod>]\t\t-- :browse <mod> (default: first loaded module)"

i pass the browse command to use because my ghci has two versions (also
in that patch pending for ghc head), and you might want to protect that
call to head. but otherwise, no need to wait for someone to hack the
ghci sources to fix that ticket!-)

=== #95: GHCi editor binding with ":e"

the ghci ticket that asked for adding the hugs :edit command also asked
for more of hugs :edit functionality: after failing to load a file, :e
will edit the first error location. can we do that without fixing ghci?

yes, we can, but there is a slight obstacle to overcome. we could make a
variant of :redir that redirects stderr instead of stdout, but we want
to apply it to commands like :load/:reload, to capture their error
reports, if any. unfortunately, those commands clear the bindings in
scope, which means that our cleanup operation would fail to restore
stderr to the terminal, not to mention reading the tempfile contents
into a variable. we need to be a bit more careful.

fortunately, :l/:r do not clear the current commands, so if we manage to
capture our handle and filename variables in a temporary command, we can
execute that to finish processing after calling :l/:r. so, here we go:

  let redirErr varcmd = case break Data.Char.isSpace varcmd of 
      { (var,_:cmd) -> return $ unlines 
        [":set -fno-print-bind-result"
        ,"tmp <- System.Directory.getTemporaryDirectory"
        ,"(f,h) <- System.IO.openTempFile tmp \"ghci\""
        ,"ste <- GHC.Handle.hDuplicate System.IO.stderr"
        ,"GHC.Handle.hDuplicateTo h System.IO.stderr"
        ,"System.IO.hClose h"
        ,"let readFileNow f = readFile f >>= \\t->length t `seq` return t"
        ,"let afterCmd _ = do { GHC.Handle.hDuplicateTo ste System.IO.stderr
                              ; r <- readFileNow f
                              ; System.Directory.removeFile f
                              ; return $ \""++var++" <- return \"++show r }"
        ,":def afterCmd afterCmd"
        , cmd
        , ":afterCmd"
        , ":undef afterCmd" ]
      ; _ -> return "putStrLn \"usage: :redirErr <var> <cmd>\"" }

  :def redirErr cmdHelp redirErr 
    ":redirErr <var> <cmd>\t-- execute <cmd>, redirecting stderr to <var>"

if you compare redirErr with redir, you'll notice that we've only
exchanged stderr for stdout, and moved the commands after cmd into a
temporary :afterCmd, which slightly complicates the quoting. otherwise,
the plan is unchanged.

we're now in a position to handle error location editing:

  let loadEditErr m = return $ unlines 
      [if null m then ":redirErr err :r" else ":redirErr err :l "++m
      ,"let match pat = Text.Regex.matchRegex (Text.Regex.mkRegex pat)"
      ,"let ms = Data.Maybe.catMaybes $ 
                 map (match \"^([^:]*):([^:]*):([^:]*):\") $ lines err"
      ,":cmd return $ case ms of 
                      { ([mod,line,col]:_) -> (\":e +\"++line++\" \"++mod)
                      ; _ -> \"\" }"]

  :def le cmdHelp loadEditErr 
    ":le [<mod>]\t\t-- try to :load <mod> (default to :reload); edit first error, if any"

depending on whether we use :le with an explicit module to load or not,
we use :load or :reload, capture the error output, filter it for error
locations (<file>:<line>:<col>:), and call the editor with that
information for the first error, if any.

we could even make ghci wait for the editor to close, then loop until
all errors have been handled. but that kind of edit/compile loop is
better handled inside your editor (in vim, :help quickfix).

== wrapping up, keeping a list

finally, let's keep a record of the command we've added, so that we can
find out what they are and what they do:

  let { cmds = [".","pwd","ls","redir","redirErr","grep","find","b","le","defs"]
      ; defs "list"  = return $ unlines $ 
        "putStrLn \"\"": [":"++cmd++" --help"| cmd <- cmds]++ ["putStrLn \"\""]
      ; defs "undef" = return $ unlines [":undef "++cmd| cmd <- cmds]
      ; defs _       = return "putStrLn \"usage: :defs {list,undef}\"" }

  :def defs cmdHelp defs 
    ":defs {list,undef}\t-- list or undefine user-defined commands"

we simply list the commands manually (we could make them
self-registering by keeping an IORef somewhere, but let's keep things
simple for now), and provide an administrative interface (:defs) to list
(call --help for all) or undefine (useful if you edit your .ghci file
and want to reload it) all our commands. that's how we were able to type
':defs list' at the beginning of this session to get our overview of
available user-defined commands.

i hope you enjoyed this little tour of ghci, and find the proposed
commands and techniques useful.

claus