RFC: style cleanup & guidelines for GHC, and related bikeshedding

Simon Peyton Jones simonpj at microsoft.com
Thu Jul 3 16:29:40 UTC 2014


Just to say that 

* In general I don't have a strong opinion about these stylistic issues.
  Moreover I have little bikeshed time, and if I don't contribute to 
  a debate I can't expect to influence it much.  So I'm mostly happy to 
  accept a consensus view.

However, some thoughts

* I don't think we should be over-prescriptive.  Eg personally I like a
  do-notation style with semicolons at the beginning, eg
      do { blah
         ; x <- foo
         ; etc }
  but I'm disinclined to force everyone to do so, and I don't want to be
  forced to do something different myself.  I can adapt to the style of the
  author here.

* A *primary* form of consumption is the source code itself.  I've found that
  Haddock-compliant comments can be rather less readable in source code.
  (Eg. CoreSyn.lhs where the #blah# notation coexists uneasily with Note [blah].)
  So I'd be nervous of mandating Haddock-compliance.

* Insisting on line comments exclusively, carries a cost.  The on-screen distraction 
  of line comments, and the nuisance of writing them, is not trivial.  Perhaps it
  is bearable, but it's non-zero.  See example below. 

Simon


With line comments

-- Note [Extra dependencies from .hs-boot files]
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- Consider the following case:
-- 
--   module A where
--     import B
--     data A1 = A1 B1
-- 
--   module B where
--     import {-# SOURCE #-} A
--     type DisguisedA1 = A1
--     data B1 = B1 DisguisedA1
-- 
-- We do not follow type synonyms when building the dependencies for each datatype,
-- so we will not find out that B1 really depends on A1 (which means it depends on
-- itself). To handle this problem, at the moment we add dependencies to everything
-- that comes from an .hs-boot file. But we don't add those dependencies to
-- everything. Imagine module B above had another datatype declaration:
-- 
--   data B2 = B2 Int
-- 
-- Even though B2 has a dependency (on Int), all its dependencies are from things
-- that live on other packages. Since we don't have mutual dependencies across
-- packages, it is safe not to add the dependencies on the .hs-boot stuff to B2.
-- 
-- See also Note [Grouping of type and class declarations] in TcTyClsDecls.

With block comments

Note [Extra dependencies from .hs-boot files]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider the following case:

  module A where
    import B
    data A1 = A1 B1

  module B where
    import {-# SOURCE #-} A
    type DisguisedA1 = A1
    data B1 = B1 DisguisedA1

We do not follow type synonyms when building the dependencies for each datatype,
so we will not find out that B1 really depends on A1 (which means it depends on
itself). To handle this problem, at the moment we add dependencies to everything
that comes from an .hs-boot file. But we don't add those dependencies to
everything. Imagine module B above had another datatype declaration:

  data B2 = B2 Int

Even though B2 has a dependency (on Int), all its dependencies are from things
that live on other packages. Since we don't have mutual dependencies across
packages, it is safe not to add the dependencies on the .hs-boot stuff to B2.

See also Note [Grouping of type and class declarations] in TcTyClsDecls.

| -----Original Message-----
| From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Jan
| Stolarek
| Sent: 03 July 2014 09:45
| To: ghc-devs at haskell.org
| Subject: Re: RFC: style cleanup & guidelines for GHC, and related
| bikeshedding
| 
| I fully support Austin's proposal. My eyes hurt when I work on 5 files
| and each of them is written
| in a different style.
| 
| Now, to address a few points that were raised.
| 
| > Is it just for the sake of beauty (not to diminish the importance of
| beauty)?
| 
|   * I believe that trailing whitespaces are a practical issue: they are
| invisible to the human eye
| (unless you have (setq-default show-trailing-whitespace t) \n (setq-
| default indicate-empty-lines
| t) in your .emacs file) but carry a semantic meaning for git and other
| version control systems.
| This means that accidental removal of a trailing whitespace - which can
| and will happen if you
| don't highlight them - will lead to false changes in the diff.
| That said, simply removing existing trailing whitespaces is not enough -
| we would need a way to
| keep them from reappearing. Sadly, this idea was rejected:
| http://www.haskell.org/pipermail/ghc-devs/2013-August/002074.html
| Git has some cool tools (like git diff --check) that aid programmer in
| dealing with trailing
| whitespaces, but not everyone uses them. Here's a relatively recent
| example of a new trailing
| whitespace sneaking into the source code:
| 
| https://github.com/ghc/ghc/commit/ce19d5079ea85d3190e837a1fc60000fbd82134
| d#diff-ababf87bf3da1f44484a901a8fbc0eb6R388
| 
| So without a way to enforce this policy removing trailing whitespaces
| doesn't seem to make sense.
| 
|   * Tabs will become a practical problem once #9230 is fixed. Also, tab
| width can be custom-set to
| whatever value in an editor, which can trip some people. I for one used
| to have tab width set to
| 2 (unlike the default of 4 or 8) but gave up, because too many tab-
| formatted things were
| displayed incorrectly.
| 
|   * I strongly support turning lhs files into hs files. This is practical
| and explained below.
| 
| From my point of view other things are just pure aesthetics.
| 
| ======
| 
| I'm not sure about the style of do-notation. When I first saw do { ... ;
| ... } I thought it was
| terrible but now, after some time, I see the merit of it. Consider this
| code snippet from
| StgCmmBind:
| 
|             \(_offset, node, arg_regs) -> do
|                 { (...)
|                 ; withSelfLoop (bndr, loop_header_id, arg_regs) $ do
|                 { (...)
|                 ; entryHeapCheck cl_info node' arity arg_regs $ do
|                 {
|                 (...)
|                 }}}
| 
| I believe that without { ... ; ... } these nested dos would have to be
| indented, which wouldn't
| aid readability IMO. I agree that we should use one style of the do-
| notation but I'm not sure
| which one should that be. I'd favor the { ... ; ... } one.
| 
| =====
| 
| > So, overall, I kind of like our current policy of fixing tabs and
| white-space when we have to
| modify the code anyway.
| I proposed to remove tabs from the source code 1,5 year ago:
| 
| http://www.haskell.org/pipermail/ghc-devs/2013-January/000053.html
| 
| Back then my conclusion was that most of the files that contain tabs were
| modified in the previous
| 6 months. This means that people are not following the policy to remove
| tabs when they modify the
| file (or at least they weren't following that policy back then). So this
| policy seems to be only
| theory.
| 
| ======
| 
| One more style issue that was not mentioned by Austin are block comments:
| I strongly advocate
| removal of all block comments in favour of single-line ones. Why? Here's
| a reason:
| 
| {-
| Note [Blah blah]
| ~~~~~~~~~~~~
| ... Code snippet:
|   foo :: Foo -> Bar
|   foo = ...
| -}
| 
| My primary method of finding things in the source code is grepping and I
| believe that many people
| do the same. With block comments it is impossible to tell whether the
| line found by grep is part
| of the comment. With line comments it is immediately obvious. Moreover,
| if all comments were line
| comments it would be really easy to grep for something only in the
| comments. Currently this is
| impossible. This is also true for lhs files and for that reason I also
| consider lhs files to be
| practical issue.
| Another reason (albeit minor) to have line comments instead of block
| comments is that with the
| former style each comment is independent and can be freely moved around.
| With block comments we
| might need to create extra comment blocks when moving comments around.
| 
| ======
| 
| Now, I understand people who don't want such change because of merge
| conflicts. But the truth is
| there will never be a good moment to implement such changes because there
| is always some ongoing
| work and outstanding branches. So I believe we should think whether these
| changes move us in a
| good direction or not and if we decide that these changes are a Good
| Thing - which I believe they
| are - we should bite the bullet. Otherwise we will have mess in the
| source code forever.
| 
| Since I'm advocating strongly for these I am of course willing to put my
| work into making this
| happen. So far I've assigned #9230 to myself and if we agree to detab all
| the source code I can
| be the person to do it.
| 
| Janek
| 
| PS. I am proud of myself, because I wrote a mail as long as Austin's,
| which I always thought was
| impossible :-)
| _______________________________________________
| ghc-devs mailing list
| ghc-devs at haskell.org
| http://www.haskell.org/mailman/listinfo/ghc-devs


More information about the ghc-devs mailing list