In-Class Workshop after Project 1

Lecture Notes for CS 190
Spring 2015
John Ousterhout

  • Classes tend to be too thin; rarely too thick.
    • Each class introduces complexity
    • Classes may be simple individually, but the relationships between them become complex
    • Thin class example: User.java
    • Information leakage (Router classes example)
    • Temporal decomposition rather than information-hiding
  • Methods also too thin sometimes, especially private ones
    • Makes code behavior non-obvious
    • Every method creates complexity (new API)
  • When to bring things together, when to separate? Bring together if:
    • Shared information
    • Related task: do the whole job in one place
    • Repeated pattern
    • Situations that need to be handled in the same way
    • Benefits of bringing together:
      • Eliminates dependencies
      • Eliminates redundancy
    • Examples:
      • Cache eviction
      • Query parameter parsing
      • JSON escaping
      • Error handling
      • HTTP request gets processed twice: one class reads it in and stores it in a string; then another class parses it.
      • Query parameter parsing: parsed values retain their URL escaping and are unescaped only on use.
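A minimal sketch of the query-parameter point, assuming a hypothetical QueryParams class (the name and the Map-based interface are illustrative, not taken from any student project). The parser does the whole job in one place: it splits on "&" and "=" first, then unescapes each name and value, so callers never see URL escaping.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser: callers receive fully unescaped names and values. */
final class QueryParams {
    /** Parses a query string such as "a=1&b=x%20y" (without the leading '?'). */
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate empty input and stray '&' characters
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            // Unescape after splitting, so an escaped '&' or '=' stays part of the value.
            params.put(unescape(name), unescape(value));
        }
        return params;
    }

    /** The only place in the program that knows about URL escaping. */
    private static String unescape(String s) {
        try {
            // Throws IllegalArgumentException on malformed escapes such as "%zz".
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

With this shape, QueryParams.parse("user=al%20ice&msg=a%26b+c") maps "user" to "al ice" and "msg" to "a&b c"; no other code needs to remember to unescape.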
  • When to separate?
    • When things truly need to be treated differently:
      • Example: using toString methods to generate JSON
        • Writing toString methods whose output is JSON
        • Depending on existing toString methods (e.g. for arrays) to generate JSON
        • Using method names like listToString in a JSON class, rather than listToJSON
        • Be aware of what's the same and what is different
      • Exceptions that must be handled differently (will discuss later)
    • Multi-purpose (generic) versus single-purpose (application- or feature-specific)
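To illustrate the toString/JSON distinction, here is a rough sketch assuming a hypothetical Json helper class (the names quote and listToJson are illustrative). Java's built-in List.toString produces output like [a, b], with no quoting or escaping, so JSON generation gets its own methods, named for what they actually produce.

```java
import java.util.List;

/** Hypothetical helper that keeps all JSON formatting and escaping in one place. */
final class Json {
    /** Returns s as a JSON string literal, including the surrounding quotes. */
    static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters need a four-digit hex escape.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }

    /** Named listToJson, not listToString: the result is JSON, not debugging output. */
    static String listToJson(List<String> items) {
        StringBuilder out = new StringBuilder("[");
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) {
                out.append(',');
            }
            out.append(quote(items.get(i)));
        }
        return out.append(']').toString();
    }
}
```

Here toString stays free for debugging output, while everything that must obey JSON syntax goes through one class.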
  • Data management abstractions:
    • Should there be a generic storage layer between basic Java I/O and Tweeter-specific functionality?
    • Hard to find a clean and useful in-between abstraction
      • Quite a bit of extra code to implement
      • Only a small benefit to the higher-level code
      • Interfaces often thin or ragged (information leakage to higher-level code)
    • Class-level documentation often not great:
      • What is the abstraction?
      • Under what conditions would I want to use this mechanism?
  • Top-down versus bottom-up design
    • Bottom-up is probably easier for novices, but can lead to thin classes.
    • For cleanest code structure, need a combined approach
  • Common source of complexity: dependencies and restrictions
    • "Current tweet id" managed by Tweet class: other code must recompute this from existing data during startup.
    • HTTP request class invokes a Router static method, but this dependency is not documented
    • HTTP request processing code has specific dependencies on Tweeter: couldn't be used for any other application without modification
    • Database class creates a FileHandler, whose constructor secretly reads in data and initializes the database (using singleton getInstance).
    • Private methods must be invoked in a specific order (e.g. during HTTP request parsing)
    • Command-line argument processor (separate method) also initializes some of the major data structures.
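One way to remove a dependency like the "current tweet id" one is to let the class that owns the data also own everything derived from it. The sketch below assumes a hypothetical TweetStore class and a simple id-to-text map; none of these names come from the project. The store restores its own id counter when it is rebuilt, so startup code has nothing to recompute or remember.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical store that owns both the tweets and the id counter. */
final class TweetStore {
    private final Map<Long, String> textById = new HashMap<>();
    private long nextId = 1;

    /** Rebuilds a store from previously saved tweets and restores the id counter itself. */
    static TweetStore fromSaved(Map<Long, String> saved) {
        TweetStore store = new TweetStore();
        for (Map.Entry<Long, String> entry : saved.entrySet()) {
            store.textById.put(entry.getKey(), entry.getValue());
            // The next id must be larger than any id already in use.
            store.nextId = Math.max(store.nextId, entry.getKey() + 1);
        }
        return store;
    }

    /** Stores a new tweet and returns the id assigned to it. */
    long add(String text) {
        long id = nextId++;
        textById.put(id, text);
        return id;
    }

    /** Returns the text of the given tweet, or null if there is no such tweet. */
    String get(long id) {
        return textById.get(id);
    }
}
```

Reading the saved tweets from disk is left out on purpose; the point is only that callers of fromSaved and add never handle the counter.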
  • Exception handling issues:
    • Many missing checks:
      • Not enough fields on HTTP initial line
      • Bad integer syntax
    • Not enough information in messages.
    • High complexity:
      • Many catch statements (error checks often repeated)
      • Hard to tell if all errors are handled properly
    • Techniques:
      • Group exceptions into categories according to how they will be handled
      • Catch exceptions in higher-level methods to reduce handler duplication.
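The two techniques above can be sketched as follows, assuming a hypothetical BadRequestException category and RequestHandler class (both names, and the string results, are illustrative). Low-level parsing code performs the checks and throws one category of exception with a specific message; a single higher-level method catches that category once.

```java
/** Hypothetical category: any problem caused by a malformed client request. */
final class BadRequestException extends RuntimeException {
    BadRequestException(String message) {
        super(message);
    }
}

final class RequestHandler {
    /** Parses an initial line of the form "METHOD URL VERSION" and returns the URL. */
    static String parseRequestLine(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length != 3) {
            throw new BadRequestException("expected 3 fields on initial line but found "
                    + fields.length + ": \"" + line + "\"");
        }
        return fields[1];
    }

    /** Converts a parameter value to an int, reporting which parameter was bad. */
    static int parseIntParam(String name, String value) {
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException e) {
            throw new BadRequestException("parameter '" + name
                    + "' is not an integer: \"" + value + "\"");
        }
    }

    /** The one place where malformed-request errors are turned into a response. */
    static String handle(String requestLine, String countValue) {
        try {
            String url = parseRequestLine(requestLine);
            int count = parseIntParam("count", countValue);
            return "200 OK: " + url + " count=" + count;
        } catch (BadRequestException e) {
            return "400 Bad Request: " + e.getMessage();
        }
    }
}
```

Because every malformed-request problem lands in the same category, adding a new check means adding one throw, not another catch block.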
  • Documentation issues:
    • Comments often just repeat information that is already in a variable or method name
    • No need to re-document information that's already well-documented elsewhere:
      • The HTTP protocol
      • List of supported URLs and what they do
      • JSON syntax
      • Command-line options
    • Class-level documentation doesn't capture the abstractions very well
    • Things that aren't documented well:
      • File structure on disk
      • Policy for caching information in memory
  • In-class coding exercise: URL decoding
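For reference, one possible shape for a URL decoder is sketched below; it is not the solution worked out in class. It assumes a hypothetical SimpleUrlDecoder class, treats "+" as a space (as in query strings), interprets %XX escapes as UTF-8 bytes, and rejects malformed escapes with a message that says what was wrong and where.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; the name and the error policy are assumptions. */
final class SimpleUrlDecoder {
    /** Decodes %XX escapes (as UTF-8 bytes) and turns '+' into a space. */
    static String decode(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%') {
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException(
                            "truncated escape at index " + i + " in \"" + s + "\"");
                }
                int hi = hexValue(s.charAt(i + 1));
                int lo = hexValue(s.charAt(i + 2));
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException(
                            "bad hex digits at index " + i + " in \"" + s + "\"");
                }
                bytes.write(hi * 16 + lo);
                i += 3;
            } else if (c == '+') {
                bytes.write(' ');
                i++;
            } else {
                // Copy any other character through unchanged, encoded as UTF-8.
                int codePoint = s.codePointAt(i);
                byte[] raw = new String(Character.toChars(codePoint))
                        .getBytes(StandardCharsets.UTF_8);
                bytes.write(raw, 0, raw.length);
                i += Character.charCount(codePoint);
            }
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Returns the value of an ASCII hex digit, or -1 if c is not one. */
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

Collecting bytes first and converting to a string at the end lets multi-byte sequences such as %C3%A9 decode to a single character.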