<div dir="ltr">That has a high chance of backfiring and requiring everyone to use do { ...; ... } with explicit braces and semis. ;)<div><br></div><div>-Edward</div><div class="gmail_extra"><br><br><div class="gmail_quote">

On Wed, Jul 2, 2014 at 4:08 AM, Simon Marlow <span dir="ltr"><<a href="mailto:marlowsd@gmail.com" target="_blank">marlowsd@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Agreed, let's do it.  Thanks for the well-argued proposal.<br>

<br>

Next up: consistent style :-)<br>

<br>

Cheers,<br>

Simon<div class=""><br>

<br>

On 27/06/2014 10:51, Johan Tibell wrote:<br>

</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">

Hi!<br>

<br>

I found myself exploring new parts of the GHC code base the last few<br>

weeks (exciting!), which again reminded me of my biggest frustration<br>

when working on GHC: the lack of per-function/type (Haddock) comments.<br>

<br>

GHC code is sometimes commented with "notes", which are great but tend<br>

to (1) mostly cover the exceptional cases and (2) talk about the<br>

implementation of a function, not how a caller might use it or why.<br>

<br>

Lack of documentation, in GHC and other software projects, usually has<br>

(at least) two causes:<br>

<br></div>

  * Programmers comment code they think is "complex enough to warrant a<div class=""><br>

    comment". The problem is that the author is usually a poor judge of<br>

    what's complex enough, because he/she is too familiar with the code<br>

    and tends to under-document code when following this principle.<br></div>

  * Documenting is boring and tends to have little benefit the person<div class=""><br>

    writing to documentation. Given lack of incentives we tend to<br>

    document less than we ought to.<br>

<br>

I've only seen one successful way to combat the lack of documentation<br>

that stems from the above: have the project's style guide mandate that<br>

top-level functions and types (or at least those that are exported) have<br>

documentation. This works well at Google.<br>

<br>

Anecdote: we have one code base inside Google that was until recently<br>

exempt from this rule and documentation is almost completely absent in<br>

that code base, even though hundreds of engineers work on and need to<br>

understand it every day. This breeds institutional knowledge problems<br>

i.e. if the author of a core piece of code leaves, lots of knowledge is<br>

lost.<br>

<br></div>

*Proposal: *I propose that we require that new top-level functions and<div class=""><br>

types have Haddock comments, even if they start out as a single, humble<br>

sentence.<br>

<br>

I've found that putting even that one sentence (1) helps new users and<br>

(2) establishes a place for improvements to be made. There's a strong<br>

"broken window" effect to lack of comments, in that lack of comments<br>

breeds more lack of comments as developers follow established practices.<br>

<br>

We should add this requirement to the style guide. Having it as a<br>

written down policy tends to prevent having to re-hash the whole<br>

argument about documentation over and over again. This has also helped<br>

us a lot at Google, because programmers can spend endless amount of time<br>

arguing about comments, placement of curly braces, etc. and having a<br>

written policy helps cut down on that.<br>

<br>

To give an idea of how to write good comments, here are two examples of<br>

undocumented code I ran into in GHC and how better comments would have<br>

helped.<br>

<br></div>

*First example*<div class=""><br>

In compiler/nativeGen/X86/Instr.<u></u>hs there's a (local) function called<br>

mkRUR, which is a helper function use when computing instruction<br>

register usage.<br>

<br>

The first question that I asked upon seeing uses of that function was<br>

"what does RUR stand for?" Given the context the function is in, I<br>

guessed it stands for read-update-read, because R is used to mean "read"<br>

in the enclosing function and "updating" is related to "reading" so that<br>

must be what U stands for. It turns out that it stands for<br>

RegUsageReadonly. Here's a comment that would have captured, in a single<br>

sentence, what this function is for:<br>

<br>

     -- | Create register usage info for instruction that only<br>

     -- reads registers.<br>

     mkRUR src = src' `seq` RU src' []<br>

         where src' = filter (interesting platform) src<br>

<br>

That already a big improvement. A note about the register filtering,<br>

which means that not all registers you pass to the function will be<br>

recorded as being read in the end, could also be useful.<br>

<br>

Aside: providing a type signature, which would have made it clear that<br>

the return type is RU, might also have helped in this particular case.<br>

<br></div>

*Second example*<div class=""><br>

In the same file there a function called x86_regUsageOfInstr. It's the<br>

function that encloses the local function mkRUR above.<br>

<br>

I could figure out that this function has something to do with register<br>

usage, of the instruction passed as an argument, and that register usage<br>

is important for the register allocator. However, trying to understand<br>

in more detail what that meant was more of challenge than it needed to<br>

be. First, a comment more clearly explaining what computing register<br>

usage means in practice would be helpful:<br>

<br>

     -- | Returns which registers are read and written by this<br>

     -- instruction, as a (read, written) pair. This info is used<br>

     -- by the register allocator.<br>

     x86_regUsageOfInstr :: Platform -> Instr -> RegUsage<br>

<br>

The reason mentioning that the return value is essentially a (read,<br>

written) pair is helpful is because the body of the function a big case<br>

statement full of lines like this one:<br>

<br>

     GCMP _ src1 src2 -> mkRUR [src1,src2]<br>

     ...<br>

     FDIV _ src  dst  -> usageRM src dst<br>

<br>

It's not immediately clear that all the various helper functions used<br>

here just end up computing a pair of the above form. A top-level comment<br>

lets you understand what's going on without understanding exactly what<br>

all these helper functions are doing.<br>

<br>

Thoughts?<br>

<br>

-- Johan<br>

<br>

<br>

<br></div><div class="">

______________________________<u></u>_________________<br>

ghc-devs mailing list<br>

<a href="mailto:ghc-devs@haskell.org" target="_blank">ghc-devs@haskell.org</a><br>

<a href="http://www.haskell.org/mailman/listinfo/ghc-devs" target="_blank">http://www.haskell.org/<u></u>mailman/listinfo/ghc-devs</a><br>

<br>

</div></blockquote><div class="HOEnZb"><div class="h5">

______________________________<u></u>_________________<br>

ghc-devs mailing list<br>

<a href="mailto:ghc-devs@haskell.org" target="_blank">ghc-devs@haskell.org</a><br>

<a href="http://www.haskell.org/mailman/listinfo/ghc-devs" target="_blank">http://www.haskell.org/<u></u>mailman/listinfo/ghc-devs</a><br>

</div></div></blockquote></div><br></div></div>