Modular Design

Lecture Notes for CS 190
Spring 2016
John Ousterhout

  • How to minimize dependencies?
    • Modular Design
      • Divide system into modules that are relatively independent
      • Ideal: each module completely independent of the others
        • System complexity = complexity of worst module
      • In reality, modules are not completely independent
        • Some modules must invoke facilities in other modules
        • Design decisions in one module must sometimes be known to other modules
        • Can't change one module without understanding parts of other modules
    • Abstraction
      • Minimize dependencies between modules
      • Divide each module into two parts:
        • Interface of a module: anything about that module that must be known to other modules
          • Formal aspects: method signatures, public variables, etc.
          • Informal aspects: side effects, algorithms that affect behavior of methods, etc.
        • Implementation: code that enforces the promises made by the interface
      • Goal for interface design: maximize functionality/(interface complexity), i.e. the most functionality for the least interface complexity (a "sweet" interface or module)
  • How to design sweet interfaces?
  • Parnas paper
  • Information Hiding
  • Each module (class) encapsulates certain knowledge or design decisions
  • The interface does not reflect these design decisions (much)
    • Can modify the implementation without impacting other classes
  • Opposite of information hiding: information leakage
    • Implementation details exposed, other classes depend on them
  • Information hiding works inside a class as well as between classes
    • Design methods to encapsulate information
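A minimal sketch of information hiding, not taken from the course materials. The class name WordIndex and its methods are hypothetical. The point is that the storage choice (a dict of sets) and the case-folding rule stay inside the class, so either could change without touching callers.

```python
class WordIndex:
    """Maps each word to the documents that contain it.

    The interface is two methods. How postings are stored (here a dict of
    sets) is a private design decision that callers never see.
    """

    def __init__(self):
        self._postings = {}  # hidden representation: word -> set of doc ids

    def add(self, word, doc_id):
        """Record that doc_id contains word; case and duplicates are handled here."""
        self._postings.setdefault(word.lower(), set()).add(doc_id)

    def docs_for(self, word):
        """Return a sorted list of doc ids containing word (empty if none)."""
        return sorted(self._postings.get(word.lower(), ()))


index = WordIndex()
index.add("Module", 2)
index.add("module", 1)
index.add("module", 2)
print(index.docs_for("MODULE"))  # prints [1, 2]
```

Information leakage would be exposing _postings directly: callers would then depend on it being a dict of sets keyed by lowercase words.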
  • Classes Should be Thick
    • Thin class or method:
      • Not much functionality
      • Short methods
      • It takes almost as much code to invoke a method as it would take to just retype the method inline
      • Classic example: linked list
      • Also see User.java
      • Thin classes don't hide much information
    • Thick class or method:
      • Lots of functionality, yet simple interface
      • Hides lots of information
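The contrast between thin and thick can be sketched as follows. Both ThinStack and top_words are hypothetical names chosen for illustration; neither is the linked list or User.java example mentioned above.

```python
from collections import Counter
import re


class ThinStack:
    """Thin: each method is a one-line pass-through to a list, so the class
    hides almost nothing and calling it costs as much as using the list."""

    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()


def top_words(text, n=3):
    """Thick: one call hides tokenizing, case folding, counting, and
    tie-breaking behind a two-argument interface."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending count, then alphabetically so results are stable.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in ranked[:n]]
```

In terms of functionality/(interface complexity), ThinStack offers two methods for roughly two lines of behavior, while top_words offers one call for several design decisions.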
  • Related problem: classitis
    • Too many classes
    • Bad example: Java libraries
  • Rule of thumb: 200-2000 lines is a good size for classes
    • Below 200 lines: probably pretty thin
    • Above 2000 lines: internal complexity of the class can become unmanageable. See if it can be subdivided cleanly.
    • However, size itself isn't the most important metric: it's functionality/(interface complexity)
  • When to bring things together into one class, when to separate?
  • Bring together if:
    • Shared information: when you see information leakage, see if you can pull the code together
    • Related task: do the whole job in one place
    • Repeated pattern
    • Situations that need to be handled in the same way (e.g., exception handling)
    • Benefits of bringing together:
      • Eliminates dependencies
      • Eliminates redundancy
    • Examples from the Tweeter project (code that should have been brought together):
      • HTTP request gets processed twice: one class reads it in and puts it in a string, then another class parses it.
      • Query value parsing: parsed values retained their URL escaping and were unescaped only on use, so every use site had to know about escaping.
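One way to do the whole query-parsing job in one place is sketched below. The function name parse_query is hypothetical and this is not the Tweeter project's code. It assumes the last value wins when a name repeats. In real Python code the standard library's urllib.parse.parse_qs already covers this.

```python
from urllib.parse import unquote_plus


def parse_query(query):
    """Parse 'a=1&b=two+words' into a dict of fully decoded strings.

    Splitting and unescaping both happen here, so no caller ever sees
    (or has to remember to decode) a URL-escaped value.
    If a name repeats, the last value wins (an assumption of this sketch).
    """
    params = {}
    for pair in query.split("&"):
        if not pair:
            continue  # tolerate empty segments such as a trailing '&'
        name, _, value = pair.partition("=")
        params[unquote_plus(name)] = unquote_plus(value)
    return params
```

Knowledge of URL escaping now lives in one function instead of leaking to every place a value is used.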
  • Separate if:
    • There isn't much shared information
    • Things truly need to be treated differently
    • Multi-purpose (generic) versus single-purpose (application- or feature-specific)
  • API Simplicity
    • How to design APIs for a class?
    • Decide what's important, design the interface around that
      • Focus on the things that are done most frequently
      • Technique #1: if a particular task is invoked repeatedly, design an API around that task (even better, do it automatically, without having to be invoked).
      • Technique #2: if a collection of tasks is similar but not identical, look for common features shared by all of them; design APIs for the common features.
      • It's OK to provide APIs for infrequently-used features, but design them in a way that you don't need to be aware of them when using the common features.
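A small sketch of an API designed around the common case. The function split_fields and its options are hypothetical. The frequent task needs one argument, and the infrequently used features are keyword-only options that a caller can ignore completely.

```python
def split_fields(line, *, sep=",", strip=True, max_fields=None):
    """Split one line of delimited text into fields.

    Common case: split_fields(line). The keyword-only options cover rarer
    needs and can be ignored entirely by callers who do not need them.
    """
    # str.split takes maxsplit (number of cuts), which is max_fields - 1.
    maxsplit = -1 if max_fields is None else max_fields - 1
    fields = line.split(sep, maxsplit)
    return [f.strip() for f in fields] if strip else fields


print(split_fields("a, b ,c"))                           # common case
print(split_fields("k: v: w", sep=":", max_fields=2))    # rarer options
```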
  • Bad example: Java I/O
  • Good example: device-independent I/O in UNIX/Linux:
    • Before UNIX: different kernel calls for opening and accessing files vs. devices.
      • Different kernel calls for each device: terminal, tape, etc.
      • Different naming mechanisms for each device
    • UNIX emphasized commonality across devices:
      • Devices have names in the file system: special device files
      • All devices have same basic access structure: open, read, write, seek, close
      • Handle device-specific operations with one additional kernel call:
        /* Schematic signature; the actual Linux prototype is
           int ioctl(int fd, unsigned long request, ...); */
        int ioctl(int fd, int request,
                void* inBuffer, int inputSize,
                void* outBuffer, int outputSize);
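The payoff of the common access structure can be sketched with Python's thin wrappers over the same kernel calls. The helper copy_bytes is a hypothetical name, and the usage assumes a Unix-like system where /dev/zero and /dev/null exist.

```python
import os


def copy_bytes(src_path, dst_path, count):
    """Copy up to count bytes from src_path to dst_path.

    Nothing here checks whether a path names a regular file or a device:
    open, read, write, and close work the same way on both.
    """
    copied = 0
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            while copied < count:
                chunk = os.read(src, min(4096, count - copied))
                if not chunk:
                    break  # end of input
                # os.write may write fewer bytes than requested; loop until done.
                view = memoryview(chunk)
                while view:
                    view = view[os.write(dst, view):]
                copied += len(chunk)
        finally:
            os.close(dst)
    finally:
        os.close(src)
    return copied
```

For example, copy_bytes("/dev/zero", "zeros.bin", 16) reads from a device into a file, and copy_bytes("zeros.bin", "/dev/null", 16) goes the other way, with no device-specific code in either direction.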
        
  • Pick the most general-purpose/abstract API that meets today's needs
    • Example: not "open file" or "open device", just "open"
    • Increases the likelihood you can reuse for other purposes
  • How much to plan ahead?
    • "Should I implement extra features beyond those that I need today?"
    • Design facilities that are general-purpose when possible (but don't get carried away)
    • Don't create a lot of specific features that aren't needed now; you can always add them later.
    • When you discover that new features or a more general architecture are needed, do it right away: don't hack around it.
    • Module writers should embrace suffering:
      • Take on hard problems
      • Solve completely
      • Make solution easy for others to use
      • Take on more challenges yourself, so that others have fewer issues to deal with
    • Push complexity down into modules:
      • Let a few module developers suffer, rather than thousands of users
      • Simple APIs are more important than a simple implementation
    • Solve, don't punt:
      • Handle error conditions inside the module where possible, rather than throwing exceptions for callers to deal with
      • Minimize "voodoo constants" (configuration parameters)
        • If you don't know the right value, how will a user or administrator ever figure it out?
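A minimal sketch of "solve, don't punt" for one error condition. The function name ensure_removed is hypothetical. It assumes the caller's real goal is "no file at this path", so a file that is already missing is success rather than an exception.

```python
import os


def ensure_removed(path):
    """Make sure no file exists at path.

    A missing file is not treated as an error: the caller wanted the file
    gone, and it is. Other failures (for example, permission problems)
    still propagate, because this function cannot resolve them.
    """
    try:
        os.remove(path)
    except FileNotFoundError:
        pass  # already gone, which is the state the caller asked for
```

Every caller of os.remove would otherwise need its own existence check or try/except; here the module absorbs that complexity once.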
  • Are long methods OK?
    • Sometimes: see TransportDispatcher.cc (method consists of relatively independent pieces).
    • Shorter is generally better, but only decompose if it can be done cleanly (are there dependencies between the parts?).
  • Applying These Ideas
    • May be hard initially to apply these ideas when writing code.
    • Make 2 designs and compare
    • Pick one and write some code
    • Review these notes to look for potential problems in the code
    • Revise code
    • Take advantage of code reviews
  • Red flags to look for:
    • Information leakage & dependencies
    • Thin classes
    • Repeated pieces of code (violates DRY: Don't Repeat Yourself)
    • Very deep call stacks (especially if one method simply calls another with essentially the same arguments)
    • Lint: little bits of unnecessary complexity
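The call-stack red flag can be sketched as follows. All three classes are hypothetical illustrations. PassThroughService shows the flag: a method that forwards the same arguments and adds nothing. VersionedService shows a layer that contributes functionality of its own.

```python
class Storage:
    """Hypothetical lower layer that actually does the work."""

    def __init__(self):
        self._records = {}

    def save(self, key, value):
        self._records[key] = value

    def load(self, key):
        return self._records[key]  # raises KeyError if key is absent


class PassThroughService:
    """Red flag: adds a layer to the call stack but no functionality."""

    def __init__(self, storage):
        self._storage = storage

    def save(self, key, value):
        return self._storage.save(key, value)  # same arguments, nothing added


class VersionedService:
    """Earns its layer: keeps every value ever saved under a key."""

    def __init__(self, storage):
        self._storage = storage

    def history(self, key):
        """Return all values saved under key, oldest first (empty if none)."""
        try:
            return list(self._storage.load(key))
        except KeyError:
            return []

    def save(self, key, value):
        self._storage.save(key, self.history(key) + [value])
```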