[Haskell-cafe] Layout rule (was Re: PrefixMap: code review request)

Wed Mar 1 05:39:24 EST 2006

On Wed, 2006-03-01 at 01:36 +0000, Brian Hulley wrote:

> Currently all the ASCII editors I know of only do keyword highlighting, or 
> occasional ("wait a second while I'm updating the buffer") identifier 
> highlighting.

hIDE and Visual Haskell use the ghc lexer and get near-instantaneous
syntax highlighting. Because they use a proper lexer they get fully
accurate highlighting, not your ordinary "fairly close" regex-based
highlighting. They also re-lex only the parts that need it. They keep
the lexer state at the start of each line, and when a line changes they
re-lex from the beginning of that line and keep going until the lexer
reaches the start of a later line in the same state that was previously
saved for that line. I may be wrong, but I think there is also an
optimisation to not lex beyond the end of the current screen until it is
scrolled. This means that even when someone types "{-", you never have
to re-lex and re-highlight more than a screenful.
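
To make the per-line scheme concrete, here is a minimal sketch in
Haskell. It is not the code hIDE or Visual Haskell use: the lexer state
is reduced to the nesting depth of {- -} comments, and the names
lexLine, initialStates and relexFrom are made up for illustration. The
screen-limit optimisation is not modelled.

```haskell
-- Toy lexer state (assumption): nesting depth of {- -} comments
-- at the start of a line. A real lexer state carries much more.
type LexState = Int

-- Scan one line from a given start state and return the state at its end.
-- Only block comment delimiters are recognised in this sketch.
lexLine :: LexState -> String -> LexState
lexLine d ('{':'-':rest)         = lexLine (d + 1) rest
lexLine d ('-':'}':rest) | d > 0 = lexLine (d - 1) rest
lexLine d (_:rest)               = lexLine d rest
lexLine d []                     = d

-- Start state of every line when lexing the whole buffer from scratch.
initialStates :: [String] -> [LexState]
initialStates ls = init (scanl lexLine 0 ls)

-- Line i of the buffer has changed (same number of lines as before).
-- Re-lex from line i until the state at the start of a later line matches
-- the state saved for it. Returns the updated start states and the number
-- of lines that were re-lexed.
relexFrom :: Int -> [String] -> [LexState] -> ([LexState], Int)
relexFrom i newLines saved = (take (i + 1) saved ++ updated, count)
  where
    (updated, count) = go (saved !! i) (drop i newLines) (drop (i + 1) saved)

    -- s is the start state of the first remaining line; olds are the
    -- saved start states of the lines after it.
    go _ [] _ = ([], 0)
    go s (l:rest) olds =
      let s' = lexLine s l
      in case olds of
           []                 -> ([], 1)    -- reached the last line
           (o:os) | o == s'   -> (olds, 1)  -- converged: keep saved states
                  | otherwise -> let (us, n) = go s' rest os
                                 in (s' : us, n + 1)
```

In a four-line buffer, an ordinary edit to line 1 stops after one line,
while adding an unclosed "{-" to line 1 forces the remaining three lines
to be re-lexed, which is where the screen limit would help.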

> What I'm trying to get however is complete grammatical 
> highlighting and type checking that is instantaneous as the user types code, 
> so this means that the highlighter/type checker needs a complete AST (with 
> 'gap' nodes to represent spans of incomplete/bad syntax) to work from.

With hIDE and Visual Haskell we have found it sufficient to do a
complete parse rather than do it incrementally. We wait for a second or
so after the user has stopped typing (since highlighting errors as
you're actually typing would just be annoying) and then run the ghc
front end. This is sufficiently fast (except perhaps on very large
modules).
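
The "wait until typing stops" rule can be modelled as a pure function.
This is only a sketch of the timing logic, not how either editor
implements it; a real editor would use a timer that each keystroke
resets. The names Millis and checkTimes are hypothetical.

```haskell
-- Times in milliseconds since the editor started (an assumption here).
type Millis = Int

-- Given the quiet period and the keystroke times in ascending order,
-- return the times at which a full front-end run would be started:
-- one after every keystroke that is followed by at least `quiet`
-- milliseconds without another keystroke.
checkTimes :: Millis -> [Millis] -> [Millis]
checkTimes quiet keys =
  [ t + quiet
  | (t, next) <- zip keys (map Just (drop 1 keys) ++ [Nothing])
  , maybe True (\n -> n - t >= quiet) next
  ]
```

For example, checkTimes 1000 [0, 200, 400, 3000, 3100] gives
[1400, 4100]: two bursts of typing, so two runs of the front end rather
than five.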

> However it is way too expensive to re-parse the whole of a big buffer after 
> every keypress (I tried it with a parser written in C++ to parse a variant 
> of ML and even with the optimized build and as many algorithmic 
> optimizations as I could think of it was just far too slow, and I wasn't 
> even trying to highlight duplicate identifiers or do type inference)

It may be possible to do some more caching to speed things up without
going to a full incremental parser. For example the editor could
maintain a buffer of lexed symbols and have a traditional parser use
that. It may also be possible to just re-parse parts of the file.
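
A rough sketch of the token-buffer idea, under strong simplifying
assumptions: tokens never span lines, lines are compared by position
(so inserting a line invalidates everything after it), the "lexer" is
just `words`, and the "parser" only collects names followed by "=". All
names here (TokenCache, lexTokens, refresh, definedNames) are invented
for the example.

```haskell
type Token = String

-- One cache entry per line: the line's text and the tokens lexed from it.
type TokenCache = [(String, [Token])]

-- Stand-in lexer (assumption): whitespace-separated words.
lexTokens :: String -> [Token]
lexTokens = words

-- Bring the cache up to date with the buffer. Only lines whose text
-- differs from the cached text at the same position are lexed again.
-- Also returns how many lines had to be lexed.
refresh :: TokenCache -> [String] -> (TokenCache, Int)
refresh cache ls = (map fst results, length (filter snd results))
  where
    results = zipWith step (map Just cache ++ repeat Nothing) ls
    step (Just entry@(old, _)) l | old == l = (entry, False)
    step _ l = ((l, lexTokens l), True)

-- Stand-in for a traditional parser (assumption): it reads the cached
-- token stream and reports every name directly followed by "=".
definedNames :: TokenCache -> [Token]
definedNames cache = [ n | (n, "=") <- zip ts (drop 1 ts) ]
  where ts = concatMap snd cache
```

The parser still walks the whole token stream on each run, but the
lexing cost is paid only for the lines that actually changed.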

> Thus to get a fast responsive editing environment which needs to maintain a 
> parse of the whole buffer to provide effective grammatical highlighting and 
> not just trivial keyword highlighting it seems (to me) to be essential to be 
> able to limit the effect of any keystroke especially when the user is just 
> typing text from left to right but where there may be more stuff after the 
> cursor eg if the user has gone back to editing a function at the top of a 
> file. Things like {- would mean that all the parse trees for everything 
> after it would have to be discarded. Also, flashing of highlighting on this 
> scale could be very annoying for a user, so I'd rather just delete this 
> particular possibility of the user getting annoyed when using my software 
> :-) thus my hopeless attempts to convince everyone that {- is bad news all 
> round :-)))

As I mentioned, it is possible to limit the effect of a {- to a
screenful of re-lexing. I grant you that it's likely to do worse things
to your incremental parser.

Duncan