Writing Comments

Lecture Notes for CS 190
Spring 2015
John Ousterhout

Code alone cannot cleanly represent all the information needed to use and maintain a module
In other cases, information could be deduced from the code, but doing so would be time-consuming
In-code comments provide this extra information; they are an essential tool in managing complexity
Comments make software appear simpler:
- Describe the (simple) interface to a module whose implementation is complicated
Comments are still controversial (!)
- A significant fraction of all commercial code (50%?) is uncommented
- Excuses:
  - "This code is self-documenting"
  - "Comments get out of date and become misleading"
  - "I don't have time to write comments"
  - "The comments I have seen are worthless; why bother?"
Kinds of comments:
- Interface documentation (e.g. classes and methods)
- Individual variables: instance variables for a class, method parameters, method results, local variables
- Implementation documentation (e.g. comments inside a method)
- Cross-module design decisions (e.g. common protocols)
Comments should describe things that are not obvious from the code.
- Mistake #1: comments duplicate code (see slides)
- Mistake #2: non-obvious info is not described
  - Low-level details:
    - Units for a variable
    - Invariants
    - Special cases (what does null mean?)
  - Higher-level information:
    - Rationale for the current design: why the code is this way.
    - How to choose the value of a configuration parameter.
    - Abstractions: a higher-level, more intuitive, description of what the code is doing.
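The two mistakes can be contrasted in a short sketch. The `RetryPolicy` class, its fields, and its values are hypothetical, invented only for illustration: the first comment merely restates the declaration, while the others record units, an invariant, advice on choosing a configuration value, and the rationale behind the design.

```java
/** Hypothetical class, invented only to contrast the two mistakes. */
class RetryPolicy {
    // Mistake #1: this comment only restates the declaration.
    // The maximum number of attempts.
    int maxAttempts = 5;

    // Delay before the first retry, in milliseconds. Raise it if the
    // server is often overloaded; lower it if callers need fast failures.
    long initialDelayMs = 100;

    // Upper bound on any single delay, in milliseconds.
    // Invariant: maxDelayMs >= initialDelayMs.
    long maxDelayMs = 10_000;

    /**
     * Returns how long to wait, in milliseconds, before retry number
     * attempt (0 for the first retry). The delay doubles with each
     * retry so that a struggling server sees progressively less
     * traffic, and it never exceeds maxDelayMs.
     */
    long delayBeforeRetry(int attempt) {
        long delay = initialDelayMs;
        for (int i = 0; i < attempt && delay < maxDelayMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMs);
    }
}
```

Deleting the comment on `maxAttempts` loses nothing; deleting any of the others forces the reader to reconstruct the information from the code, or to guess.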
Two kinds of documentation for classes and methods:
- Interface: what someone needs to know in order to use this class or method
- Implementation: how the method or class works internally to implement the advertised interface.
- Important to separate these: do not describe the implementation in the interface documentation!
Interface documentation:
- Put immediately before the class or method declaration
- Goal: create simple, intuitive model for users
- Simpler is better
- Complete: must include everything that any user might need to know
- Interface description may use totally different terms than the implementation (if they are simpler)
Example: index range lookup
- Large table of objects for a Web site, split across dozens of servers
- Table has indexes for looking up objects by certain attributes (name, salary, etc.)
- IndexLookup class retrieves a range of values in index order
```
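// Retrieve, in index order, every object whose key is in the requested range.
IndexLookup query = new IndexLookup(index, key1, key2);
while (true) {
    Object object = query.getNext();
    if (object == null) {
        break;    // No more objects in the range.
    }
    ...
}
```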
```
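One way the interface documentation for such a class might look is sketched below. This is an in-memory stand-in, not the distributed implementation: the `NavigableMap` index type, the `String` objects, and the choice of a range that is inclusive at both ends are all assumptions made for the sketch. The interface comments say only what a caller needs to know; the cursor is mentioned only in an implementation comment.

```java
import java.util.Iterator;
import java.util.NavigableMap;

/**
 * Retrieves, in index order, every object whose indexed attribute
 * falls in a given key range. (Hypothetical in-memory stand-in; the
 * inclusive range is an assumption of this sketch.)
 */
class IndexLookup {
    // Implementation detail, deliberately absent from the interface
    // comments: a cursor over the matching slice of the index.
    private final Iterator<String> cursor;

    /**
     * Starts a range query.
     *
     * @param index  Maps each attribute value to the object that has it.
     * @param key1   Smallest attribute value to return (inclusive).
     * @param key2   Largest attribute value to return (inclusive);
     *               must not be less than key1.
     */
    IndexLookup(NavigableMap<String, String> index, String key1, String key2) {
        cursor = index.subMap(key1, true, key2, true).values().iterator();
    }

    /**
     * Returns the next object in the range, in ascending key order,
     * or null once every object in the range has been returned.
     */
    String getNext() {
        return cursor.hasNext() ? cursor.next() : null;
    }
}
```

The comments never mention servers, cursors, or how the range is scanned, so the implementation can change without the documentation becoming wrong.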
Documentation for variables and return values: be very specific
- Exactly what is this thing?
- What are the units?
- Boundary conditions
  - Does "end" refer to the last value, or the value after the last one?
  - Is a null pointer value allowed? If so, what does it mean?
- If memory is dynamically allocated, who is responsible for freeing it?
- Invariants?
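A minimal sketch of parameter and result comments that answer these questions. The `LatencyStats` class and its method are hypothetical; in Java the closest analogue to the "who frees it?" question is whether the callee modifies or keeps a reference to an argument.

```java
/** Hypothetical helper, invented to show parameter and result comments. */
class LatencyStats {
    /**
     * Adds up a contiguous run of latency samples.
     *
     * @param samples  Latencies in microseconds. Must not be null; the
     *                 array is read but never modified or retained.
     * @param start    Index of the first sample to include.
     * @param end      Index just past the last sample to include, so
     *                 start == end selects nothing.
     *                 Invariant: 0 <= start <= end <= samples.length.
     * @param cap      Largest total to report, in microseconds, or null
     *                 if the total should not be capped.
     * @return         Sum of samples[start] through samples[end - 1], in
     *                 microseconds, reduced to cap if it would exceed it.
     */
    static long totalMicros(long[] samples, int start, int end, Long cap) {
        long total = 0;
        for (int i = start; i < end; i++) {
            total += samples[i];
        }
        return (cap != null && total > cap) ? cap : total;
    }
}
```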
Implementation documentation (comments inside methods):
- For many methods, not needed
- For longer methods, document major blocks of code
  - Describe what's happening at a higher level
  - E.g., what does each loop iteration do?
- Documenting variables is less important for method local variables (can see all of the uses), but sometimes needed for longer methods or tricky variables.
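As an illustration of block-level implementation comments, here is a sketch using a hypothetical `IntervalMerger` class. Each comment describes what a block, or one loop iteration, accomplishes rather than restating individual statements.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example of comments inside a method. */
class IntervalMerger {
    /**
     * Returns the smallest list of intervals that covers the same
     * points as the input. Each interval is {start, end}, inclusive.
     * The input list and its arrays are not modified.
     */
    static List<int[]> merge(List<int[]> intervals) {
        // Sort by start so that overlapping intervals become adjacent.
        List<int[]> sorted = new ArrayList<>(intervals);
        sorted.sort((a, b) -> Integer.compare(a[0], b[0]));

        // Each iteration either extends the most recent output interval
        // (if the next input overlaps it) or starts a new output interval.
        List<int[]> merged = new ArrayList<>();
        for (int[] next : sorted) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && next[0] <= last[1]) {
                last[1] = Math.max(last[1], next[1]);
            } else {
                merged.add(new int[] {next[0], next[1]});
            }
        }
        return merged;
    }
}
```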
Documenting cross-module design decisions:
- Example: network protocol
- Example: how does the system deal with zombie servers?
- Challenging:
  - No single rational place to put the documentation (people won't know where to look for it)
  - Don't want to repeat the same documentation everywhere it applies
- One possible approach:
  - Create designNotes file, with various tagged sections
    - "Zombies"
    - "Timing-dependent tests"
  - In the code, just refer to the design notes file:
```
// See "Zombies" in designNotes.
```
To maximize value of comments:
- It must be easy for people to find the right documentation at the right time
- The documentation must get updated as the code changes
Techniques:
- Document each thing exactly once: don't duplicate documentation (it won't get maintained)
  - Use references rather than repeating documentation: "See documentation for xyz method".
- Put documentation as close as possible to the relevant code
  - Next to variable and method declarations
  - Push in-method documentation down to the tightest enclosing context
- Don't say anything more in documentation than you need to
  - e.g., don't use comments in one place to describe design decisions elsewhere
  - Higher-level comments are less likely to become obsolete
- Look for "obvious" locations where people can easily find documentation (see Status example in slides)
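A small sketch of documenting something exactly once and referring to it elsewhere. The `Session` class and its expiry rule are hypothetical; the rule is described next to the constant's declaration, and the method that applies it points there with a Javadoc `{@link}` instead of repeating it.

```java
/** Hypothetical class showing one rule documented in one place. */
class Session {
    /**
     * Idle time, in seconds, after which a session expires. Measured
     * from the most recent request, not from login. This is the only
     * place the expiry rule is described.
     */
    static final long IDLE_LIMIT_SECONDS = 900;

    /** Time of the most recent request, in seconds since the epoch. */
    private long lastRequestSeconds;

    /** Creates a session whose first request happens at nowSeconds. */
    Session(long nowSeconds) {
        lastRequestSeconds = nowSeconds;
    }

    /** Records that a request arrived at nowSeconds. */
    void touch(long nowSeconds) {
        lastRequestSeconds = nowSeconds;
    }

    /**
     * Returns true if the session has expired as of nowSeconds.
     * See {@link #IDLE_LIMIT_SECONDS} for the expiry rule.
     */
    boolean isExpired(long nowSeconds) {
        return nowSeconds - lastRequestSeconds > IDLE_LIMIT_SECONDS;
    }
}
```

If the rule changes, only the comment on `IDLE_LIMIT_SECONDS` needs to be updated.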
Best comments are those written at the beginning:
- I write class comments, method headers (signature and comments) before writing the bodies of methods
- Helps me to define the overall APIs, juggle functionality between methods
- As I write and test code, can revise the comments to make them better and better
Counter-arguments:
- "Why waste time writing comments when the code is still changing?"
- "Once I get the code done, I promise to write all the comments"
- Problem #1: you won't go back and write the comments later
- Problem #2: if you do, the comments will be bad:
  - You're in a hurry and emotionally checked out
  - You have forgotten many of the design decisions, subtleties
  - Comments will be a superficial duplication of what's obvious from the code

Name choice is an important form of documentation
- Take time to think of names that are clear and unambiguous
- Be specific (but not too long)!
- Always use the same variable name for the same kind of object
- Avoid using the same name to refer to different kinds of things
Suggestions for projects:
- Header comment blocks for every method, every class.
- Document every class instance variable, every method parameter, every result.
- Add comments inside methods if/when needed
- Skip comments only if you're sure the code will be obvious to readers
- Follow Javadoc conventions
Red flags for comments:
- Hard to come up with a clear and simple name for variable?
- Method documentation has to document every internal feature of the algorithm in order to be complete?
- Interface documentation for a class or method has to be very long in order to be complete?

CS 190: Software Design Studio (Spring 2015)
