Raft Project 1 Review/Discussion

Lecture Notes for CS 190
Winter 2018
John Ousterhout

  • Not many hefty deep classes
  • Biggest overall challenge: when to bring together, when to keep apart?
    • Related things should be located near each other (e.g., same class)
    • Unrelated things should be kept apart
  • Reasons to bring together:
    • Shared information
    • Duplicated code
    • Can simplify the interface
  • Reasons to pull apart:
    • Unrelated information
    • Separate general-purpose code from special-purpose uses.
  • Example: you know how a class will be used, so info specific to that usage gets embedded in the class. Result: class over-specialized, higher cognitive load for readers.
  • Just because you know something doesn't mean you should embed that knowledge in every class!
  • Rule of thumb: separate general-purpose code from special-purpose code; avoid putting specialized info in classes
  • Example: tried to divide state machine up into multiple classes, but this didn't work well
  • Rule of thumb: if two pieces of code access the same information, bring them together
  • Rule of thumb: try to do each logical task in one place; don't split up parts of a task unless they are relatively independent.
  • Red flags for things that should be brought together: information leakage, flipping back and forth, code duplication.
  • Pull related things together, even within a class (Examples 2, 3, 4)

Threads and Synchronization

  • Why are threads needed?
  • What will limit performance?
  • Will the choice of synchronization mechanism have a big impact on performance?
  • Other synchronization notes:
    • Keep it as simple as possible
    • Coarse-grain is best, unless its performance is intolerable
    • Use simple monitor-style
      • Lock on object
      • Acquire lock on method entry, release on exit
    • Finer-grain locks are very hard to get right (Example T1)
    • Find a way to document your overall synchronization strategy (no good place? See Example T2)
    • Persistence has issues similar to synchronization: unsafe to persist term and vote separately.
  • Timers added a lot of complexity
    • Starting and stopping is verbose
    • Separate election and heartbeat timers also add complexity
    • Hard to keep track of which is running (state isn't obvious)
  • Alternative approach: don't start and stop timers
    • One repeating timer
    • Pick frequency that is a fraction of election timeout (~1/10th?)
    • Keep state variables:
      • Last time AppendEntries requests were sent
      • Last time we heard from a valid leader
    • When timer fires, check state variables to see whether time-related actions need to happen
    • Eliminates various race conditions that happen with timers.
    • Or don't even have a timer; just specify a time limit on waiting operations, such as selector.select.
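
The single-repeating-timer approach, combined with monitor-style locking, might look roughly like the sketch below. The class RaftTicker, its Action enum, and all method names are hypothetical and do not come from any project submission. Randomized election timeouts are left out to keep the sketch short.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch: one repeating tick replaces start/stop timers. */
class RaftTicker {
    enum Action { NONE, SEND_HEARTBEAT, START_ELECTION }

    private final long electionTimeoutMs;
    private final long heartbeatIntervalMs;
    private boolean isLeader = false;
    private long lastAppendEntriesSentMs;   // last time AppendEntries were sent
    private long lastHeardFromLeaderMs;     // last time a valid leader was heard

    RaftTicker(long electionTimeoutMs, long heartbeatIntervalMs, long nowMs) {
        this.electionTimeoutMs = electionTimeoutMs;
        this.heartbeatIntervalMs = heartbeatIntervalMs;
        this.lastAppendEntriesSentMs = nowMs;
        this.lastHeardFromLeaderMs = nowMs;
    }

    // Monitor style: every method locks the object on entry, releases on exit.
    synchronized void heardFromLeader(long nowMs) {
        isLeader = false;
        lastHeardFromLeaderMs = nowMs;
    }

    synchronized void becomeLeader(long nowMs) {
        isLeader = true;
        // Backdate so that the next tick sends a heartbeat right away.
        lastAppendEntriesSentMs = nowMs - heartbeatIntervalMs;
    }

    /** Called on every tick; decides whether a time-related action is due. */
    synchronized Action tick(long nowMs) {
        if (isLeader) {
            if (nowMs - lastAppendEntriesSentMs >= heartbeatIntervalMs) {
                lastAppendEntriesSentMs = nowMs;
                return Action.SEND_HEARTBEAT;
            }
            return Action.NONE;
        }
        if (nowMs - lastHeardFromLeaderMs >= electionTimeoutMs) {
            lastHeardFromLeaderMs = nowMs;  // restart the election clock
            return Action.START_ELECTION;
        }
        return Action.NONE;
    }

    static long nowMs() {
        return System.nanoTime() / 1_000_000;  // monotonic clock
    }

    /** Starts the one repeating timer at 1/10th of the election timeout. */
    ScheduledExecutorService start(Consumer<Action> handler) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        long periodMs = Math.max(1, electionTimeoutMs / 10);
        timer.scheduleAtFixedRate(() -> handler.accept(tick(nowMs())),
                periodMs, periodMs, TimeUnit.MILLISECONDS);
        return timer;
    }
}
```

Because tick takes the current time as an argument, the timing logic can be exercised without a real timer. The timer is never started or stopped, so there is no question of which timer is currently running.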

Exceptions

  • In general, hard to convince myself that exception handling was correct.
    • Problem: code split. Throw in one place, catch in another.
  • Have a plan
  • First think about how to handle (categories?)
    • State file doesn't exist
    • State file cannot be read or parsed
    • Can't write persistent state (Example E1)
    • Incoming message is malformed
    • I/O error receiving request
    • I/O error sending response
    • I/O error in outbound communication (request or response)
    • Can't open socket to listen on
  • The same exception (IOException) may have to be handled differently in different situations
  • Report exceptions in terms that make sense to handler
    • Example: disk error writing persistent state: IOException?
    • Ask what is the right abstraction
    • May need to define new exception classes
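
One possible way to report a disk error in terms that make sense to the handler is sketched below. PersistenceException and StateStore are hypothetical names, and the file layout (term on one line, vote on the next) is an assumption made only for illustration. The sketch writes term and vote together, since persisting them separately is unsafe. It does not force the data to disk, and whether the atomic move replaces an existing file is platform-dependent.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical exception: says what failed in Raft terms, not I/O terms. */
class PersistenceException extends Exception {
    PersistenceException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical store that persists term and vote as a single unit. */
class StateStore {
    private final Path file;

    StateStore(Path file) {
        this.file = file;
    }

    synchronized void save(long term, String votedFor) throws PersistenceException {
        String contents = term + "\n" + (votedFor == null ? "" : votedFor) + "\n";
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try {
            // Write both values to a temporary file, then rename it into
            // place, so a crash never leaves a new term with an old vote.
            Files.write(tmp, contents.getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, file, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            // Translate the low-level error into one the caller can act on.
            throw new PersistenceException(
                    "cannot write persistent state to " + file, e);
        }
    }
}
```

A caller that catches PersistenceException knows the server can no longer safely vote or accept entries, which a bare IOException from somewhere in the call stack would not tell it.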

Documentation

  • Most projects not bad overall, but still issues: see Examples D1-D6

Issues Related to Raft

  • Most groups implemented messages, not RPCs
    • A remote procedure call consists of paired messages: request and response
    • Response is paired with request, typically synchronous (wait for response)
    • RPC system will typically retry if no response received
    • With message-based approach, each message independent, no obvious association between requests and responses.
  • For Raft servers, don't really need RPC
    • Can process response without knowing which request it answers
    • No need for retry at this level: higher-level timers for elections and heartbeats are sufficient
    • Messages are simpler to implement (especially if you can create a new socket for each one!)
    • But, what will you do for client communication in Project 2?
  • Lots of corner cases that weren't considered:
    • Only keep count of votes, rather than list of servers?
    • Trying to do I/O without blocking:
      • Only part of a message arrives
      • Immediately wait on new socket for message: will block if sender didn't send a message.
      • Can block on writing as well as reading, if socket backs up.
    • Barrier-style broadcasts.
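
On the vote-counting corner case, one reading is that a bare counter counts the same server twice if its response is delivered more than once. A minimal sketch of tracking the set of voters instead, with the hypothetical class name VoteTally and server IDs assumed to be strings:

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: remember who voted, not just how many. */
class VoteTally {
    private final int clusterSize;
    private final Set<String> voters = new HashSet<>();

    VoteTally(int clusterSize, String selfId) {
        this.clusterSize = clusterSize;
        voters.add(selfId);  // a candidate votes for itself
    }

    /** Records a granted vote; returns true once a majority has voted. */
    synchronized boolean recordVote(String serverId) {
        voters.add(serverId);  // a duplicate response has no effect
        return voters.size() > clusterSize / 2;
    }
}
```

With five servers, a duplicate vote from the same server leaves the tally at two, and only a vote from a third distinct server produces a majority.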

Odds and Ends

  • Avoid Pairs (Example M1): expedient but poor abstraction, obscure.
    • Define small container class instead
  • Getters and setters: when is it better to make variables public?
    • If every instance variable has a getter.
    • If class is shallow and likely to stay that way.
  • ByteBuffer: interface both shallow and obscure (Example M2)
    • Constantly have to read the documentation?
    • Essentially no information hiding
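
As an illustration of defining a small container class instead of a Pair, the sketch below names its two values. Example M1 is not reproduced here, so LogPosition and its fields are hypothetical. The fields are public and final rather than hidden behind getters, on the assumption that the class is shallow and likely to stay that way.

```java
/** Hypothetical replacement for something like Pair<Integer, Long>. */
final class LogPosition {
    public final int index;  // position of the entry in the log
    public final long term;  // term in which the entry was created

    LogPosition(int index, long term) {
        this.index = index;
        this.term = term;
    }

    /** Raft's comparison: later term wins; equal terms compare by index. */
    boolean isAtLeastAsUpToDateAs(LogPosition other) {
        if (term != other.term) {
            return term > other.term;
        }
        return index >= other.index;
    }
}
```

A reader who sees position.term knows what the value means, whereas a pair accessed by position gives no hint. The class also provides a natural home for logic such as the comparison above.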