Modular Design

Lecture Notes for CS 190
Spring 2016
John Ousterhout

How to minimize dependencies?
- Modular Design
  - Divide system into modules that are relatively independent
  - Ideal: each module completely independent of the others
    - System complexity = complexity of worst module
  - In reality, modules are not completely independent
    - Some modules must invoke facilities in other modules
    - Design decisions in one module must sometimes be known to other modules
    - Can't change one module without understanding parts of other modules
- Abstraction
  - Minimize dependencies between modules
  - Divide each module into two parts:
    - Interface of a module: anything about that module that must be known to other modules
      - Formal aspects: method signatures, public variables, etc.
      - Informal aspects: side effects, algorithms that affect behavior of methods, etc.
    - Implementation: code that carries out the promises made by the interface
  - Goal for interface design: maximize functionality/(interface complexity); an interface or module with a high ratio is "sweet"
How to design sweet interfaces?
Parnas paper
- "On the Criteria To Be Used in Decomposing Systems into Modules"
  - More than 40 years old, some parts dated (e.g. predates classes)
  - Still one of the most important papers in all of systems.
- What is the key idea of this paper?
Information Hiding
Each module (class) encapsulates certain knowledge or design decisions.
The interface does not reflect these design decisions (much)
- Can modify the implementation without impacting other classes
Opposite of information hiding: information leakage
- Implementation details exposed, other classes depend on them
Information hiding works inside a class as well as between classes
- Design methods to encapsulate information
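A minimal sketch of information hiding, assuming a hypothetical WordCounter class (not part of the course materials). The hidden design decision is how counts are stored; the interface mentions only words and counts, so the storage could change without touching callers.

```java
import java.util.HashMap;
import java.util.Map;

/** Counts word occurrences. How the counts are stored is a hidden decision. */
final class WordCounter {
    // Hidden decision: a hash map keyed by word. This could become a trie
    // or a sorted array without any change to callers.
    private final Map<String, Integer> counts = new HashMap<>();

    /** Records one occurrence of the given word. */
    void add(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    /** Returns how many times the word has been added (0 if never). */
    int count(String word) {
        return counts.getOrDefault(word, 0);
    }

    // A leaky alternative would be a method that returns the map itself:
    // callers would then depend on the representation.
}
```

Usage: after add("a") twice and add("b") once, count("a") is 2 and count("z") is 0.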

Classes Should be Thick
- Thin class or method:
  - Not much functionality
  - Short methods
  - It takes almost as much code to invoke a method as it would take to just retype the method inline
  - Classic example: linked list
  - Also see User.java
  - Thin classes don't hide much information
- Thick class or method:
  - Lots of functionality, yet simple interface
  - Hides lots of information
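The contrast can be sketched as follows. Both classes are hypothetical illustrations: ThinNames only forwards to a list, so calling it saves almost nothing, while Roster hides trimming, duplicate detection, and ordering behind one small method.

```java
import java.util.ArrayList;
import java.util.List;

/** Thin: each method only forwards to the list; callers gain almost nothing. */
final class ThinNames {
    final List<String> names = new ArrayList<>();   // representation is exposed
    void addName(String n) { names.add(n); }
    String getName(int i) { return names.get(i); }
}

/** Thick: a simple interface that hides several decisions. */
final class Roster {
    private final List<String> names = new ArrayList<>();

    /** Adds a name, ignoring blanks and case-insensitive duplicates; keeps sorted order. */
    void add(String raw) {
        String name = raw == null ? "" : raw.trim();
        if (name.isEmpty()) return;
        int pos = 0;
        for (; pos < names.size(); pos++) {
            int cmp = names.get(pos).compareToIgnoreCase(name);
            if (cmp == 0) return;   // already present
            if (cmp > 0) break;     // insertion point found
        }
        names.add(pos, name);
    }

    /** Returns a copy of the names in sorted order. */
    List<String> sorted() {
        return new ArrayList<>(names);
    }
}
```

Whether Roster is thick enough depends on how much of this work callers would otherwise repeat; the point is the ratio of hidden functionality to interface size.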
Related problem: classitis
- Too many classes
- Bad example: Java libraries
Rule of thumb: 200-2000 lines is a good size for classes
- Below 200 lines: probably pretty thin
- Above 2000 lines: internal complexity of the class can become unmanageable. See if it can be subdivided cleanly.
- However, size itself isn't the most important metric: it's functionality/(interface complexity)
When to bring things together into one class, when to separate?
Bring together if:
- Shared information: when you see information leakage, see if you can pull the code together
- Related task: do the whole job in one place
- Repeated pattern
- Situations that need to be handled in the same way (exception handling)
- Benefits of bringing together:
  - Eliminates dependencies
  - Eliminates redundancy
- Examples from Tweeter project (code that should have been brought together):
  - HTTP request gets processed twice: one class reads it in and puts it in a string; then another class parses it.
  - Query value parsing: parsed values retain their URL escaping, so every user must remember to unescape on use.
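One possible way to bring the request handling together, as a sketch only: the HttpRequest class below is hypothetical (it is not the Tweeter code) and handles just the request line. It assumes Java 10 or later for URLDecoder.decode(String, Charset). Parsing and unescaping happen once, in one class, so callers never see escaped values.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Parses an HTTP request line in one place; callers never see escaped text. */
final class HttpRequest {
    private final String method;
    private final String path;
    private final Map<String, String> query = new HashMap<>();

    HttpRequest(String requestLine) {
        String[] parts = requestLine.trim().split("\\s+");
        method = parts[0];
        String target = parts.length > 1 ? parts[1] : "/";
        int q = target.indexOf('?');
        path = q < 0 ? target : target.substring(0, q);
        if (q >= 0) {
            for (String pair : target.substring(q + 1).split("&")) {
                if (pair.isEmpty()) continue;
                int eq = pair.indexOf('=');
                String key = eq < 0 ? pair : pair.substring(0, eq);
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                // Unescape here, once, rather than at every point of use.
                query.put(decode(key), decode(value));
            }
        }
    }

    private static String decode(String s) {
        // Throws IllegalArgumentException on malformed % sequences.
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    String method() { return method; }

    String path() { return path; }

    /** Returns the unescaped value of a query parameter, or null if absent. */
    String queryValue(String name) { return query.get(name); }
}
```

With the request line "GET /search?q=hello%20world HTTP/1.1", queryValue("q") returns the already-unescaped string "hello world".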
Separate if:
- There isn't much shared information
- Things truly need to be treated differently
- Multi-purpose (generic) versus single-purpose (application- or feature-specific)
API Simplicity
- How to design APIs for a class?
- Decide what's important, design the interface around that
  - Focus on the things that are done most frequently
  - Technique #1: if a particular task is invoked repeatedly, design an API around that task (even better, do it automatically, without having to be invoked).
  - Technique #2: if a collection of tasks is similar but not identical, look for common features shared by all of them; design APIs for the common features.
  - It's OK to provide APIs for infrequently-used features, but design them in a way that you don't need to be aware of them when using the common features.
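A small sketch of these guidelines, using a hypothetical CsvLine helper. The common case (comma-separated output) takes only the fields; quoting is done automatically rather than being a separate call; and the infrequent case (another separator) lives in an overload that users of the common case never need to notice.

```java
/** Builds one line of separated values. */
final class CsvLine {
    private CsvLine() {}

    /** Common case: comma-separated; quoting is applied automatically when needed. */
    static String join(String... fields) {
        return join(',', fields);
    }

    /** Infrequent case: a different separator. */
    static String join(char separator, String... fields) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) out.append(separator);
            out.append(quoteIfNeeded(fields[i], separator));
        }
        return out.toString();
    }

    /** Quotes a field only if it contains the separator, a quote, or a newline. */
    private static String quoteIfNeeded(String field, char separator) {
        boolean needsQuotes = field.indexOf(separator) >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0;
        if (!needsQuotes) return field;
        return '"' + field.replace("\"", "\"\"") + '"';
    }
}
```

The quoting convention (doubling embedded quotes) is an assumption chosen for the illustration.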
Bad example: Java I/O
Good example: device-independent I/O in UNIX/Linux:
- Before UNIX: different kernel calls for opening and accessing files vs. devices.
  - Different kernel calls for each device: terminal, tape, etc.
  - Different naming mechanisms for each device
- UNIX emphasized commonality across devices:
  - Devices have names in the file system: special device files
  - All devices have same basic access structure: open, read, write, seek, close
  - Handle device-specific operations with one additional kernel call:
```
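/* Simplified, illustrative prototype. The real Linux call is
 * int ioctl(int fd, unsigned long request, ...), declared in <sys/ioctl.h>. */
int ioctl(int fd, int request,
        void* inBuffer, int inputSize,
        void* outBuffer, int outputSize);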
```
Pick the most general-purpose/abstract API that meets today's needs
- Example: not "open file" or "open device", just "open"
- Increases the likelihood you can reuse for other purposes

How much to plan ahead?
- "Should I implement extra features beyond those that I need today?"
- Design facilities that are general-purpose when possible (but don't get carried away)
- Don't create a lot of specific features that aren't needed now; you can always add them later.
- When you discover that new features or a more general architecture are needed, do it right away: don't hack around it.
- Module writers should embrace suffering:
  - Take on hard problems
  - Solve completely
  - Make solution easy for others to use
  - Take more challenges for yourself, so that others have fewer issues to deal with
- Push complexity down into modules:
  - Let a few module developers suffer, rather than thousands of users
  - Simple APIs are more important than a simple implementation
- Solve, don't punt:
  - Handle error conditions rather than throwing exceptions
  - Minimize "voodoo constants" (configuration parameters)
    - If you don't know the right value, how will a user or administrator ever figure it out?
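"Solve, don't punt" can be sketched with a hypothetical Text.slice method. Instead of throwing when an index is out of range, it defines a sensible result for every input, so callers have no error case to handle. Clamping is an assumption that suits this example; it is not the right answer for every API.

```java
/** String helpers that define a result for every input instead of throwing. */
final class Text {
    private Text() {}

    /**
     * Returns the characters of s in [start, end). Indexes outside the string
     * are clamped to its bounds; an empty or inverted range yields "".
     */
    static String slice(String s, int start, int end) {
        int from = Math.max(0, Math.min(start, s.length()));
        int to = Math.max(from, Math.min(end, s.length()));
        return s.substring(from, to);
    }
}
```

For comparison, String.substring throws StringIndexOutOfBoundsException for the same out-of-range arguments, which pushes the bounds checking onto every caller.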
Are long methods OK?
- Sometimes: see TransportDispatcher.cc (method consists of relatively independent pieces).
- Shorter is generally better, but only decompose if it can be done cleanly (are there dependencies between the parts?).
Applying These Ideas
- May be hard initially to apply these ideas when writing code.
- Make 2 designs and compare
- Pick one and write some code
- Review this topic to look for potential problems
- Revise code
- Take advantage of code reviews
Red flags to look for:
- Information leakage & dependencies
- Thin classes
- Repeated pieces of code (DRY)
- Very deep call stacks (especially if one method simply calls another with essentially the same arguments)
- Lint: little bits of unnecessary complexity

CS 190: Software Design Studio (Spring 2016)