Raft Project 2 Review/Discussion (Winter 2022)
Everyone made improvements from Project 1.
- Deeper classes (simpler APIs, less specialization & information leakage)
- Network communication pretty good in every project
- Better error detection and logging (but still more work to do)
Exceptions
How many new exception types to define?
- Just use std::runtime_error?
  - Too generic (it helps to split exceptions into categories that a catcher can tell apart)
- Define new types RecoverableException and FatalException?
  - Still too broad
  - Fatal vs. recoverable is determined by the catcher, not the thrower
- Instead, define a few class-specific exception types, such as StorageFailure or NetworkError
- Start with 1-2 per major class? Can add more later if needed.
- How to define exceptions? See slides.
- Must document exceptions in the interfaces
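
A minimal sketch of the class-specific approach, not the definition from the slides. Only the names StorageFailure and NetworkError come from these notes; the Log class, its failWrites test hook, and tryAppend are hypothetical. It shows the exception documented in the interface and the catcher deciding whether the failure is fatal.

```cpp
#include <stdexcept>
#include <string>

// Class-specific exception types. Deriving from std::runtime_error keeps
// what() available while letting catchers tell the categories apart.
class StorageFailure : public std::runtime_error {
public:
    using std::runtime_error::runtime_error;
};

class NetworkError : public std::runtime_error {
public:
    using std::runtime_error::runtime_error;
};

// Hypothetical interface fragment: the exception is documented where
// callers will see it.
class Log {
public:
    /**
     * Append an entry to the log.
     * \throws StorageFailure  The entry could not be stored.
     */
    void append(const std::string& entry) {
        if (failWrites) {
            throw StorageFailure("could not store entry: " + entry);
        }
        ++size;
    }

    bool failWrites = false;  // test hook standing in for a real disk error
    int size = 0;             // number of entries stored so far
};

// The catcher, not the thrower, decides whether a failure is fatal.
// This (hypothetical) caller treats a storage failure as recoverable.
bool tryAppend(Log& log, const std::string& entry) {
    try {
        log.append(entry);
        return true;
    } catch (const StorageFailure&) {
        return false;
    }
}
```

A different caller could let the same StorageFailure propagate and terminate the server; the thrower does not need to know which.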
Class Design: Together vs. Apart
Given various pieces of functionality, which belong together in the same class/method and which should be separated in different classes/methods?
Key considerations:
- Separate general-purpose and special-purpose code
- Over-specialization creates information leakage
- Combine things that are related, separate things that are not related
- Do one thing at a time
- Do the whole job in one place
Examples:
- Raft server contains state machine for shell?
- Client main program also has code to communicate with Raft cluster?
- Communication libraries for server-server and client-server communication?
- Raft server also manages communication with clients?
- Log class also manages other persistent state such as term and vote?
- Separate code for sending heartbeats and AppendEntries requests (no log entries in heartbeats)?
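
One possible reading of the last example: because a heartbeat is an AppendEntries request with no log entries, a single code path might produce both. The sketch below assumes a simplified request (fields such as prevLogTerm and leaderCommit are omitted), and all names are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Simplified, hypothetical request type.
struct AppendEntriesRequest {
    uint64_t term;
    uint64_t prevLogIndex;
    std::vector<std::string> entries;  // empty for a heartbeat
};

// Builds the request for one follower. nextIndex is the 1-based index of
// the first log entry the follower still needs. If the follower is caught
// up (nextIndex == log.size() + 1), the result carries no entries and
// serves as a heartbeat; no separate heartbeat code is needed.
AppendEntriesRequest makeAppendEntries(uint64_t term,
                                       const std::vector<std::string>& log,
                                       uint64_t nextIndex) {
    AppendEntriesRequest request{term, nextIndex - 1, {}};
    for (uint64_t i = nextIndex; i <= log.size(); ++i) {
        request.entries.push_back(log[i - 1]);
    }
    return request;
}
```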
Performance Issues
Log overheads that scale with size of the log:
- Scan entire log at startup
- Keep index for entire log in memory
- Rewrite entire index for each modification
- These shouldn't be necessary: only the most recent entries are likely to be accessed.
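
As a rough illustration of the last point, the sketch below keeps only the newest entries in memory, so memory use does not grow with the log. LogTail and its methods are hypothetical, and reading older entries from disk is left to the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <utility>

// Hypothetical in-memory tail of the log: holds at most `capacity` of the
// most recently appended entries.
class LogTail {
public:
    explicit LogTail(std::size_t capacity) : capacity_(capacity) {}

    // Records a newly appended entry; the oldest cached entry is dropped
    // once the tail is full.
    void append(std::string entry) {
        entries_.push_back(std::move(entry));
        ++lastIndex_;
        if (entries_.size() > capacity_) {
            entries_.pop_front();
        }
    }

    // Returns the entry at the given 1-based log index, or nullptr if it is
    // not cached (the caller would then fall back to reading from disk).
    const std::string* get(uint64_t index) const {
        uint64_t first = lastIndex_ - entries_.size() + 1;
        if (index < first || index > lastIndex_) {
            return nullptr;
        }
        return &entries_[index - first];
    }

private:
    std::size_t capacity_;
    uint64_t lastIndex_ = 0;  // index of the newest entry; 0 if none yet
    std::deque<std::string> entries_;
};
```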
Lots of copying of messages
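
One common way to cut message copying in C++ is to move messages between layers instead of copying them. The sketch below is only an illustration; Message and Outbox are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message type.
struct Message {
    std::string payload;
};

// Hypothetical outgoing queue.
class Outbox {
public:
    // Takes the message by value and moves it into the queue, so a caller
    // that passes std::move(msg) or a temporary does not copy the payload.
    void enqueue(Message message) {
        queue_.push_back(std::move(message));
    }

    // Read access by reference: no copy is made.
    const Message& front() const { return queue_.front(); }

    std::size_t size() const { return queue_.size(); }

private:
    std::vector<Message> queue_;
};
```

A caller that no longer needs its message would write outbox.enqueue(std::move(msg)).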
How to approach performance in general:
- Learn what operations are unusually expensive
- Design to avoid these big problems
- For everything else, design for simplicity.
Smaller Stuff
Name of parameter that determines whether to reset persistent state: notExistOK?
- When something is important, the name should convey that.
- Also doesn't have the right semantics.
Distinct methods for opening and closing connections:
- Not needed: can do automatically when sending messages
- "Just do the right thing"
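
A minimal sketch of "just do the right thing" for connections: send() opens the connection when needed, so callers never call open or close. The Connection class is hypothetical and uses no real sockets; a vector stands in for the network.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical connection wrapper with no public open()/close().
class Connection {
public:
    explicit Connection(std::string address) : address_(std::move(address)) {}

    // Sends a message, connecting first if there is no open connection.
    void send(const std::string& message) {
        if (!connected_) {
            connect();
        }
        sent_.push_back(message);
    }

    // Simulates the peer dropping the connection; the next send() reconnects.
    void simulateDrop() { connected_ = false; }

    int connectCount() const { return connects_; }
    std::size_t sentCount() const { return sent_.size(); }

private:
    // A real implementation would open a socket to address_ here.
    void connect() {
        connected_ = true;
        ++connects_;
    }

    std::string address_;
    bool connected_ = false;
    int connects_ = 0;               // how many times a connection was opened
    std::vector<std::string> sent_;  // stand-in for the network
};
```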