Raft Project 1 Review/Discussion
Lecture Notes for CS 190
Winter 2018
John Ousterhout
- Not many hefty deep classes
- Biggest overall challenge: when to bring together, when to keep
apart?
- Related things should be located near each other (e.g., same class)
- Unrelated things should be kept apart
- Reasons to bring together:
- Shared information
- Duplicated code
- Can simplify the interface
- Reasons to pull apart:
- Unrelated information
- Separate general-purpose code from special-purpose uses.
- Example: you know how a class will be used, so info specific to
that usage gets embedded in the class. Result: class over-specialized,
higher cognitive load for readers.
- Just because you know something doesn't mean you should embed that
knowledge in every class!
- Rule of thumb: separate general-purpose code from special-purpose
code; avoid putting specialized info in classes
- Example: tried to divide state machine up into multiple classes,
but this didn't work well
- Rule of thumb: if two pieces of code access the same information,
bring them together
- Rule of thumb: try to do each logical task in one place; don't
split up parts of a task unless they are relatively independent.
- Red flags for things that should be brought together: information
leakage, flipping back and forth, code duplication.
- Pull related things together, even within a class (Examples 2, 3, 4)
Threads and Synchronization
- Why are threads needed?
- What will limit performance?
- Will the choice of synchronization mechanism have a big
impact on performance?
- Other synchronization notes:
- Keep it as simple as possible
- Coarse-grain is best, unless its performance is intolerable
- Use simple monitor-style
- Lock on object
- Acquire lock on method entry, release on exit
- Finer-grain locks are very hard to get right (Example T1)
- Find a way to document your overall synchronization strategy
(no good place? See Example T2)
- Persistence has issues similar to synchronization:
unsafe to persist term and vote separately.
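
A minimal sketch of the monitor style described above, combined with persisting term and vote as one unit. The class name RaftState, the two-line file format, and the write-then-rename approach are all assumptions made for illustration; the notes do not prescribe them. The rename is atomic on common platforms but this is implementation-specific, and a real implementation would also need to force the data to disk.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical monitor-style holder for Raft's persistent state.
// Every method that touches the state is synchronized, so the object's
// lock is acquired on method entry and released on exit.
class RaftState {
    private final Path stateFile;
    private long currentTerm = 0;
    private String votedFor = "";   // empty string means "no vote yet" (assumption)

    RaftState(Path stateFile) {
        this.stateFile = stateFile;
    }

    synchronized long getCurrentTerm() { return currentTerm; }

    synchronized String getVotedFor() { return votedFor; }

    // Term and vote are updated and persisted together, never separately.
    synchronized void update(long term, String vote) throws IOException {
        persist(term, vote);
        currentTerm = term;
        votedFor = vote;
    }

    // Write both values to a temporary file, then rename it over the old
    // file, so the file on disk never holds a new term with a stale vote.
    private void persist(long term, String vote) throws IOException {
        Path tmp = stateFile.resolveSibling(stateFile.getFileName() + ".tmp");
        byte[] contents = (term + "\n" + vote + "\n").getBytes(StandardCharsets.UTF_8);
        Files.write(tmp, contents);
        Files.move(tmp, stateFile, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

The whole synchronization strategy fits in the class comment: one lock, held for the duration of each method.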
- Timers added a lot of complexity
- Starting and stopping is verbose
- Separate election and heartbeat timers also add complexity
- Hard to keep track of which is running (state isn't obvious)
- Alternative approach: don't start and stop timers
- One repeating timer
- Pick frequency that is a fraction of election timeout (~1/10th?)
- Keep state variables:
- Last time AppendEntries requests were sent
- Last time we heard from a valid leader
- When timer fires, check state variables to see whether time-related
actions need to happen
- Eliminates various race conditions that happen with timers.
- Or don't even have a timer; just specify a time limit on waiting
operations, such as Selector.select(timeout).
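
The single-repeating-timer approach above might look roughly like the following sketch. The class RaftClock, its method names, and the Action enum are hypothetical. The tick logic takes the current time as a parameter so it can be exercised without real delays; the start method wires it to one repeating timer running at a tenth of the election timeout.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: one repeating timer plus two state variables,
// instead of separate election and heartbeat timers that start and stop.
class RaftClock {
    enum Action { NONE, SEND_HEARTBEAT, START_ELECTION }

    private final long electionTimeoutMs;
    private final long heartbeatIntervalMs;
    private boolean leader = false;
    private long lastAppendEntriesSentMs;   // last time AppendEntries requests were sent
    private long lastHeardFromLeaderMs;     // last time we heard from a valid leader

    RaftClock(long electionTimeoutMs, long heartbeatIntervalMs, long nowMs) {
        this.electionTimeoutMs = electionTimeoutMs;
        this.heartbeatIntervalMs = heartbeatIntervalMs;
        this.lastAppendEntriesSentMs = nowMs;
        this.lastHeardFromLeaderMs = nowMs;
    }

    static long monotonicMs() { return System.nanoTime() / 1_000_000; }

    // Assumption: hearing from a valid leader also means we are not the leader.
    synchronized void heardFromLeader(long nowMs) {
        leader = false;
        lastHeardFromLeaderMs = nowMs;
    }

    // Assumption: the caller sends the initial AppendEntries itself.
    synchronized void becomeLeader(long nowMs) {
        leader = true;
        lastAppendEntriesSentMs = nowMs;
    }

    // Called each time the timer fires: check the state variables to see
    // whether a time-related action is due.
    synchronized Action tick(long nowMs) {
        if (leader) {
            if (nowMs - lastAppendEntriesSentMs >= heartbeatIntervalMs) {
                lastAppendEntriesSentMs = nowMs;
                return Action.SEND_HEARTBEAT;
            }
        } else if (nowMs - lastHeardFromLeaderMs >= electionTimeoutMs) {
            lastHeardFromLeaderMs = nowMs;   // restart the election clock
            return Action.START_ELECTION;
        }
        return Action.NONE;
    }

    // One repeating timer; it is never stopped and restarted on state changes.
    // The caller is responsible for shutting down the returned executor.
    ScheduledExecutorService start(Consumer<Action> handler) {
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
        long periodMs = Math.max(1, electionTimeoutMs / 10);
        exec.scheduleAtFixedRate(() -> handler.accept(tick(monotonicMs())),
                                 periodMs, periodMs, TimeUnit.MILLISECONDS);
        return exec;
    }
}
```

Which timer is "running" is no longer a question: the role flag and the two timestamps are the whole state.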
Exceptions
- In general, hard to convince myself that exception handling
was correct.
- Problem: code split. Throw in one place, catch in another.
- Have a plan
- First think about how to handle (categories?)
- State file doesn't exist
- State file cannot be read or parsed
- Can't write persistent state (Example E1)
- Incoming message is malformed
- I/O error receiving request
- I/O error sending response
- I/O error in outbound communication (request or response)
- Can't open socket to listen on
- The same exception (IOException) may have to be handled differently
in different situations
- Report exceptions in terms that make sense to handler
- Example: disk error writing persistent state: IOException?
- Ask what is the right abstraction
- May need to define new exception classes
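
One possible shape for "report exceptions in terms that make sense to the handler", as a sketch only: the low-level IOException from a failed disk write is caught where it occurs and rethrown as a new exception class that names the actual problem. StateStoreException and StateStore are hypothetical names, not taken from any project.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical exception class: says what went wrong at the level the
// caller cares about (persistent state could not be saved).
class StateStoreException extends Exception {
    StateStoreException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical wrapper around the persistent state file.
class StateStore {
    private final Path file;

    StateStore(Path file) {
        this.file = file;
    }

    void save(String contents) throws StateStoreException {
        try {
            Files.write(file, contents.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            // Translate here, so callers never have to guess whether an
            // IOException came from the disk or from the network.
            throw new StateStoreException(
                    "cannot write persistent state to " + file, e);
        }
    }
}
```

A handler higher up can then treat StateStoreException (for example, by stopping the server) differently from an IOException on a socket, even though both began as the same exception type.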
Documentation
- Most projects not bad overall, but still issues: see Examples D1-D6
Issues Related to Raft
- Most groups implemented messages, not RPCs
- A remote procedure call consists of paired messages: request
and response
- Response is paired with request, typically synchronous (wait
for response)
- RPC system will typically retry if no response received
- With message-based approach, each message independent, no
obvious association between requests and responses.
- For Raft servers, don't really need RPC
- Can process response without knowing which request it came from
- No need for retry at this level: higher-level timers for elections
and heartbeats are sufficient
- Messages are simpler to implement (especially if you can create
a new socket for each one!)
- But, what will you do for client communication in Project 2?
- Lots of corner cases that weren't considered:
- Only keep count of votes, rather than list of servers? (A duplicate
response from the same server could then be counted twice.)
- Trying to do I/O without blocking:
- Only part of a message arrives
- Immediately wait on new socket for message: will block
if sender didn't send a message.
- Can block on writing as well as reading, if socket backs up.
- Barrier-style broadcasts.
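
For the "only part of a message arrives" case above, one common approach is to buffer incoming bytes and hand back only complete messages. The sketch below assumes a wire format in which each message is preceded by a 4-byte big-endian length; that format and the MessageAssembler name are assumptions for illustration, not something the projects were required to use.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: accumulates bytes as they arrive from a
// non-blocking socket and returns only the messages that are complete.
class MessageAssembler {
    // Bytes received so far that do not yet form a complete message.
    // Kept in "write mode" (position = number of buffered bytes).
    private ByteBuffer pending = ByteBuffer.allocate(1024);

    List<byte[]> feed(byte[] chunk) {
        // Grow the buffer if the new chunk does not fit.
        if (pending.remaining() < chunk.length) {
            int needed = pending.position() + chunk.length;
            ByteBuffer bigger = ByteBuffer.allocate(Math.max(pending.capacity() * 2, needed));
            pending.flip();
            bigger.put(pending);
            pending = bigger;
        }
        pending.put(chunk);

        List<byte[]> messages = new ArrayList<>();
        pending.flip();                       // switch to reading
        while (pending.remaining() >= 4) {
            pending.mark();
            int length = pending.getInt();
            if (length < 0) {
                throw new IllegalArgumentException("malformed message: negative length");
            }
            if (pending.remaining() < length) {
                pending.reset();              // body incomplete: wait for more bytes
                break;
            }
            byte[] body = new byte[length];
            pending.get(body);
            messages.add(body);
        }
        pending.compact();                    // keep leftovers, back to write mode
        return messages;
    }
}
```

The caller passes in whatever a read returned, however little, and never blocks waiting for the rest of a message.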
Odds and Ends
- Avoid Pairs (Example M1): expedient but poor abstraction, obscure.
- Define small container class instead
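
As an illustration of the small-container alternative (Example M1 itself is not reproduced here, so this is a hypothetical case): instead of passing a server's address around as a generic pair of String and Integer, define a class whose field names say what the values are.

```java
// Hypothetical replacement for a Pair<String, Integer> holding a
// server's address. Readers see host and port instead of having to
// remember which element of the pair is which.
final class ServerAddress {
    final String host;
    final int port;

    ServerAddress(String host, int port) {
        this.host = host;
        this.port = port;
    }

    @Override
    public String toString() {
        return host + ":" + port;
    }
}
```

The fields are final and directly readable; the class is shallow and likely to stay that way, so getters would add nothing.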
- Getters and setters: when is it better to make variables public?
- If every instance variable has a getter.
- If class is shallow and likely to stay that way.
- ByteBuffer: interface both shallow and obscure (Example M2)
- Constantly have to read the documentation?
- Essentially no information hiding