RFC: style cleanup & guidelines for GHC, and related bikeshedding

Jan Stolarek jan.stolarek at p.lodz.pl
Thu Jul 3 08:44:45 UTC 2014


I fully support Austin's proposal. My eyes hurt when I work on 5 files and each of them is written 
in a different style.

Now, to address a few points that were raised.

> Is it just for the sake of beauty (not to diminish the importance of beauty)?

  * I believe that trailing whitespaces are a practical issue: they are invisible to the human eye 
(unless you have (setq-default show-trailing-whitespace t) \n (setq-default indicate-empty-lines 
t) in your .emacs file) but carry a semantic meaning for git and other version control systems. 
This means that accidental removal of a trailing whitespace - which can and will happen if you 
don't highlight them - will lead to false changes in the diff. 
That said, simply removing existing trailing whitespaces is not enough - we would need a way to 
keep them from reappearing. Sadly, this idea was rejected: 
http://www.haskell.org/pipermail/ghc-devs/2013-August/002074.html
Git has some cool tools (like git diff --check) that aid programmer in dealing with trailing 
whitespaces, but not everyone uses them. Here's a relatively recent example of a new trailing 
whitespace sneaking into the source code:

https://github.com/ghc/ghc/commit/ce19d5079ea85d3190e837a1fc60000fbd82134d#diff-ababf87bf3da1f44484a901a8fbc0eb6R388

So without a way to enforce this policy removing trailing whitespaces doesn't seem to make sense.

  * Tabs will become a practical problem once #9230 is fixed. Also, tab width can be custom-set to 
whatever value in an editor, which can trip some people. I for one used to have tab width set to 
2 (unlike the default of 4 or 8) but gave up, because too many tab-formatted things were 
displayed incorrectly.

  * I strongly support turning lhs files into hs files. This is practical and explained below.

From my point of view other things are just pure aesthetics.

======

I'm not sure about the style of do-notation. When I first saw do { ... ; ... } I thought it was 
terrible but now, after some time, I see the merit of it. Consider this code snippet from 
StgCmmBind:

            \(_offset, node, arg_regs) -> do
                { (...)
                ; withSelfLoop (bndr, loop_header_id, arg_regs) $ do
                { (...)
                ; entryHeapCheck cl_info node' arity arg_regs $ do
                {
                (...)
                }}}

I believe that without { ... ; ... } these nested dos would have to be indented, which wouldn't 
aid readability IMO. I agree that we should use one style of the do-notation but I'm not sure 
which one should that be. I'd favor the { ... ; ... } one.

=====

> So, overall, I kind of like our current policy of fixing tabs and white-space when we have to 
modify the code anyway.
I proposed to remove tabs from the source code 1,5 year ago:

http://www.haskell.org/pipermail/ghc-devs/2013-January/000053.html

Back then my conclusion was that most of the files that contain tabs were modified in the previous 
6 months. This means that people are not following the policy to remove tabs when they modify the 
file (or at least they weren't following that policy back then). So this policy seems to be only 
theory.

======

One more style issue that was not mentioned by Austin are block comments: I strongly advocate 
removal of all block comments in favour of single-line ones. Why? Here's a reason:

{-
Note [Blah blah]
~~~~~~~~~~~~
... Code snippet:
  foo :: Foo -> Bar
  foo = ...
-}

My primary method of finding things in the source code is grepping and I believe that many people 
do the same. With block comments it is impossible to tell whether the line found by grep is part 
of the comment. With line comments it is immediately obvious. Moreover, if all comments were line
comments it would be really easy to grep for something only in the comments. Currently this is 
impossible. This is also true for lhs files and for that reason I also consider lhs files to be 
practical issue.
Another reason (albeit minor) to have line comments instead of block comments is that with the 
former style each comment is independent and can be freely moved around. With block comments we 
might need to create extra comment blocks when moving comments around.

======

Now, I understand people who don't want such change because of merge conflicts. But the truth is 
there will never be a good moment to implement such changes because there is always some ongoing 
work and outstanding branches. So I believe we should think whether these changes move us in a 
good direction or not and if we decide that these changes are a Good Thing - which I believe they 
are - we should bite the bullet. Otherwise we will have mess in the source code forever. 

Since I'm advocating strongly for these I am of course willing to put my work into making this 
happen. So far I've assigned #9230 to myself and if we agree to detab all the source code I can 
be the person to do it.

Janek

PS. I am proud of myself, because I wrote a mail as long as Austin's, which I always thought was 
impossible :-)


More information about the ghc-devs mailing list