Writing Comments

Lecture Notes for CS 190
Spring 2015
John Ousterhout

  • Code alone can't cleanly represent all the information needed to use and maintain a module
  • In other cases, information could be deduced from the code, but that would be time-consuming
  • In-code comments provide this extra information; they are an essential tool in managing complexity
  • Comments make software appear simpler:
    • Describe the (simple) interface to a module whose implementation is complicated
  • Comments are still controversial (!)
    • A significant fraction of all commercial code (50%?) is uncommented
    • Excuses:
      • "This code is self-documenting"
      • "Comments get out of date and become misleading"
      • "I don't have time to write comments"
      • "The comments I have seen are worthless; why bother?"
  • Kinds of comments:
    • Interface documentation (e.g. classes and methods)
    • Individual variables: instance variables for a class, method parameters, method results, local variables
    • Implementation documentation (e.g. comments inside a method)
    • Cross-module design decisions (e.g. common protocols)
  • Comments should describe things that are not obvious from the code.
    • Mistake #1: comments duplicate code (see slides)
    • Mistake #2: non-obvious info is not described
      • Low-level details:
        • Units for a variable
        • Invariants
        • Special cases (what does null mean?)
      • Higher-level information:
        • Rationale for the current design: why the code is this way.
        • How to choose the value of a configuration parameter.
        • Abstractions: a higher-level, more intuitive, description of what the code is doing.
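    • A minimal sketch contrasting the two mistakes with a more useful comment. RetryPolicy, its constants, and the stated rationale are hypothetical, invented for this illustration; they are not taken from the slides.

```java
/** Hypothetical retry helper, used only to contrast commenting styles. */
class RetryPolicy {
    // Mistake #1: the comment below just repeats the code.
    // Set the maximum number of retries to 5.
    static final int MAX_RETRIES = 5;

    // Better: gives the units, the behavior over time, and the rationale.
    // Delay before the first retry, in milliseconds; doubles after each
    // failed attempt. Assumed (for this example) to be longer than a
    // typical round trip, so a slow server is not flooded with retries.
    static final long INITIAL_DELAY_MS = 100;

    /**
     * Returns the delay, in milliseconds, to wait before a retry.
     *
     * @param attempt Number of retries already made; must be in the
     *                range 0 <= attempt < MAX_RETRIES.
     * @return INITIAL_DELAY_MS doubled once for each retry already made.
     */
    static long delayMs(int attempt) {
        return INITIAL_DELAY_MS << attempt;
    }
}
```

    • The first comment can be deleted with no loss; the second records units and rationale that a reader could not recover from the declaration alone.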
  • Two kinds of documentation for classes and methods:
    • Interface: what someone needs to know in order to use this class or method
    • Implementation: how the method or class works internally to implement the advertised interface.
    • Important to separate these: do not describe the implementation in the interface documentation!
  • Interface documentation:
    • Put immediately before the class or method declaration
    • Goal: create simple, intuitive model for users
    • Simpler is better
    • Complete: must include everything that any user might need to know
    • Interface description may use totally different terms than the implementation (if they are simpler)
  • Example: index range lookup
    • Large table of objects for a Web site, split across dozens of servers
    • Table has indexes for looking up objects by certain attributes (name, salary, etc.)
    • IndexLookup class retrieves a range of values in index order
      query = new IndexLookup(index, key1, key2);
      while (true) {
          object = query.getNext();
          if (object == null) {
              break;
          }
          ...
      }
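    • One possible interface comment for IndexLookup, written as a sketch. Assumptions that are not in the notes: the index is an in-memory TreeMap with one object per key, keys and objects are strings, and both ends of the range are inclusive. The comments describe only what a caller needs to know and say nothing about how the table is split across servers.

```java
import java.util.Iterator;
import java.util.TreeMap;

/**
 * Retrieves, in index order, the objects whose keys lie in a given range.
 * Create one IndexLookup per query, then call getNext() repeatedly until
 * it returns null.
 */
class IndexLookup {
    // Implementation detail, deliberately absent from the interface
    // comments: the remaining matches, in key order.
    private final Iterator<String> matches;

    /**
     * @param index    Maps each key to one object (an in-memory stand-in,
     *                 assumed for this sketch).
     * @param firstKey Smallest key to return (inclusive).
     * @param lastKey  Largest key to return (inclusive); must not be
     *                 smaller than firstKey.
     */
    IndexLookup(TreeMap<String, String> index, String firstKey, String lastKey) {
        matches = index.subMap(firstKey, true, lastKey, true).values().iterator();
    }

    /**
     * @return The next object in index order, or null once every object
     *         in the range has been returned.
     */
    String getNext() {
        return matches.hasNext() ? matches.next() : null;
    }
}
```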
      
  • Documentation for variables and return values: be very specific
    • Exactly what is this thing?
    • What are the units?
    • Boundary conditions
      • Does "end" refer to the last value, or the value after the last one?
      • Is a null pointer value allowed? If so, what does it mean?
    • If memory is dynamically allocated, who is responsible for freeing it?
    • Invariants?
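    • A sketch of variable comments that answer these questions. The Selection class and its fields are hypothetical names chosen for illustration.

```java
/** Hypothetical text selection; names are illustrative only. */
class Selection {
    // Index of the first selected character. Invariant: 0 <= start <= end.
    private final int start;

    // Index of the character just after the last selected one (exclusive),
    // so the selection is empty when end == start.
    private final int end;

    // How long the selection stays highlighted, in milliseconds.
    // null means "until the user clears it"; never negative.
    private final Long highlightMs;

    Selection(int start, int end, Long highlightMs) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("need 0 <= start <= end");
        }
        if (highlightMs != null && highlightMs < 0) {
            throw new IllegalArgumentException("highlightMs must not be negative");
        }
        this.start = start;
        this.end = end;
        this.highlightMs = highlightMs;
    }

    /** Returns the number of selected characters (0 if the selection is empty). */
    int length() {
        return end - start;
    }

    /** Returns true if the highlight never expires on its own. */
    boolean isPermanent() {
        return highlightMs == null;
    }
}
```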
  • Implementation documentation (comments inside methods):
    • For many methods, not needed
    • For longer methods, document major blocks of code
      • Describe what's happening at a higher level
      • E.g., what does each loop iteration do?
    • Documentation is less important for local variables (all of their uses are visible nearby), but it is sometimes needed in longer methods or for tricky variables.
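    • A small sketch of implementation comments that describe a loop at a higher level. The RunLength class is a hypothetical example, not from the slides.

```java
/** Hypothetical example of comments inside a method. */
class RunLength {
    /**
     * Returns the run-length encoding of text: each maximal run of equal
     * characters becomes the character followed by the run's length.
     * An empty string encodes to an empty string.
     */
    static String encode(String text) {
        StringBuilder out = new StringBuilder();

        // Each iteration of this loop consumes one complete run of
        // identical characters and appends its encoding to out.
        int runStart = 0;
        while (runStart < text.length()) {
            // Find the end of the run that begins at runStart.
            // runEnd is the index just past the run's last character.
            int runEnd = runStart + 1;
            while (runEnd < text.length()
                    && text.charAt(runEnd) == text.charAt(runStart)) {
                runEnd++;
            }
            out.append(text.charAt(runStart)).append(runEnd - runStart);
            runStart = runEnd;
        }
        return out.toString();
    }
}
```

    • The loop comment says what one iteration accomplishes; the only local variable that gets a comment is runEnd, because its boundary condition is easy to get wrong.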
  • Documenting cross-module design decisions:
    • Example: network protocol
    • Example: how does the system deal with zombie servers?
    • Challenging:
      • No single rational place to put the documentation (people won't know where to look for it)
      • Don't want to repeat everywhere
    • One possible approach:
      • Create designNotes file, with various tagged sections
        • "Zombies"
        • "Timing-dependent tests"
      • In the code, just refer to the design notes file:
        // See "Zombies" in designNotes.
        
  • To maximize value of comments:
    • It must be easy for people to find the right documentation at the right time
    • The documentation must get updated as the code changes
  • Techniques:
    • Document each thing exactly once: don't duplicate documentation (it won't get maintained)
      • Use references rather than repeating documentation: "See documentation for xyz method".
    • Put documentation as close as possible to the relevant code
      • Next to variable and method declarations
      • Push in-method documentation down to the tightest enclosing context
    • Don't say anything more in documentation than you need to
      • e.g., don't use comments in one place to describe design decisions elsewhere
      • Higher-level comments are less likely to become obsolete
    • Look for "obvious" locations where people can easily find documentation (see Status example in slides)
  • Best comments are those written at the beginning:
    • I write class comments, method headers (signature and comments) before writing the bodies of methods
    • Helps me to define the overall APIs, juggle functionality between methods
    • As I write and test code, can revise the comments to make them better and better
  • Counter-arguments:
    • "Why waste time writing comments when the code is still changing?"
    • "Once I get the code done, I promise to write all the comments"
    • Problem #1: you won't go back and write the comments later
    • Problem #2: if you do, the comments will be bad:
      • You're in a hurry and emotionally checked out
      • You have forgotten many of the design decisions, subtleties
      • Comments will be a superficial duplication of what's obvious from the code
  • Name choice is an important form of documentation
    • Take time to think of names that are clear and unambiguous
    • Be specific (but not too long)!
    • Always use the same variable name for the same kind of object
    • Avoid using the same name to refer to different kinds of things
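    • A sketch of names that separate two kinds of "block". FileBlockMap and its members are hypothetical, chosen only to illustrate the point.

```java
/** Hypothetical mapping from a file's logical blocks to disk blocks. */
class FileBlockMap {
    // diskBlocks[fileBlock] is the number of the disk block that stores
    // that block of the file.
    private final int[] diskBlocks;

    FileBlockMap(int[] diskBlocks) {
        this.diskBlocks = diskBlocks.clone();
    }

    // A vaguer signature such as "int get(int block)" would leave readers
    // guessing whether "block" is a position in the file or on the disk.
    // Here fileBlock always means a position within the file, and
    // diskBlock always means a position on the device.
    int diskBlockFor(int fileBlock) {
        return diskBlocks[fileBlock];
    }
}
```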
  • Suggestions for projects:
    • Header comment blocks for every method, every class.
    • Document every class instance variable, every method parameter, every result.
    • Add comments inside methods if/when needed
    • Skip comments only if you're sure it will be obvious to readers
    • Follow Javadoc conventions
  • Red flags for comments:
    • Hard to come up with a clear and simple name for variable?
    • Method documentation has to document every internal feature of the algorithm in order to be complete?
    • Interface documentation for a class or method has to be very long in order to be complete?