Assignment 2: Synchronization

This assignment consists of two synchronization problems. The learning goal for this assignment is for you to become comfortable solving synchronization problems using the "monitor" style with locks and condition variables.

Getting Started

To get started on this assignment, log in to the myth cluster and paste this command into a console window:

git clone /afs/ir/class/archive/cs/cs111/cs111.1236/repos/assign2/$USER assign2

This will create a new directory assign2 in your current directory and clone a Git starter repository into that directory. Do your work for the assignment in this directory. For this assignment you will modify the files caltrain.hh, caltrain.cc, party.hh, and party.cc. You will need to add additional variables in the header files and fill in method bodies in the .cc files. You can also define additional classes or methods in those files, if needed.

The directory also contains a Makefile; if you type make, it will compile your code for both problems. caltrain.cc will be compiled together with a test program caltrain_test.cc, and party.cc will be compiled together with a test program party_test.cc. You can invoke the command tools/sanitycheck to run a series of basic tests on your solution. Try this now: the starter code should compile but almost all the tests will fail.

Exercise 1: Initializer Lists and Destructors

As part of this assignment, you will learn about (or review) C++ initializer lists and destructors. The file destruct.cc contains a simple class Printer that illustrates both of these features. In section, you will go over this program with your CA and discuss its behavior when it runs. After you have done that, answer Question 1 in questions.txt.
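
As a warm-up before section, here is a minimal sketch of both features. It is not the Printer class from destruct.cc; the Tracer class, its members, and run_tracer_demo are hypothetical names chosen for illustration. The sketch records when each object is constructed and destroyed so the ordering can be inspected.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical class (not the Printer from destruct.cc): appends a line to a
// shared log when it is constructed and again when it is destroyed.
class Tracer {
public:
    Tracer(const std::string& name, std::vector<std::string>& log)
        : name_(name), log_(log)  // initializer list: members are set up before the body runs
    {
        log_.push_back("construct " + name_);
    }

    // The destructor runs automatically when the object goes out of scope.
    ~Tracer() {
        log_.push_back("destroy " + name_);
    }

private:
    std::string name_;
    std::vector<std::string>& log_;  // a reference member must be bound in the initializer list
};

// Creates two Tracers in an inner scope and returns the resulting log.
std::vector<std::string> run_tracer_demo() {
    std::vector<std::string> log;
    {
        Tracer a("a", log);
        Tracer b("b", log);
        assert(log.size() == 2);  // both constructors have run
    }  // scope ends here: b is destroyed first, then a
    return log;
}
```

Local objects are destroyed in the reverse of the order in which they were constructed, so the log ends with "destroy b" followed by "destroy a". Guards such as lock_guard rely on the same mechanism: the lock is released in the guard's destructor.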

Exercise 2: Bridge Crossing

(You will go through the solution to this problem in section with your CA; we have provided a solution in the files bridge.hh and bridge.cc. There is nothing for you to turn in for this problem.)

You have been hired by Caltrans to automate the flow of traffic on a one-lane bridge that has been the site of numerous collisions. Caltrans wants you to implement the following rules:

  • Traffic can flow in only a single direction on the bridge at a time.
  • Any number of cars can be on the bridge at the same time, as long as they are all traveling in the same direction.
  • To avoid starvation, you must implement the "five car rule": once 5 or more consecutive eastbound cars have entered the bridge, if there are any westbound cars waiting, then no more eastbound cars may enter the bridge until some westbound cars have crossed. The same rule applies in the other direction once 5 or more consecutive westbound cars have entered the bridge.

You must implement the traffic flow mechanism in C++ by defining a class Bridge, which implements four methods in addition to its constructor.

When an eastbound car arrives at the bridge, it invokes the method

    arrive_eb()

This method must not return until it is safe for the car to cross the bridge, according to the rules above. Once an eastbound car has finished crossing the bridge, it will invoke the method

    leave_eb()

Westbound cars will invoke analogous functions arrive_wb and leave_wb.
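
To make the monitor style concrete, here is one possible sketch of the Bridge class. It is not the solution provided in bridge.hh and bridge.cc, and it may differ from what you discuss in section. The private helpers, the member names, and the cars_on_bridge accessor are assumptions added for illustration; only the four arrive/leave methods come from the problem statement.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// A sketch of a monitor-style Bridge: one mutex protects all state, and each
// direction has its own condition variable.
class Bridge {
public:
    void arrive_eb() { arrive(EB); }
    void leave_eb()  { leave(EB); }
    void arrive_wb() { arrive(WB); }
    void leave_wb()  { leave(WB); }

    // Hypothetical accessor, used only to inspect the state in tests.
    int cars_on_bridge() {
        std::lock_guard<std::mutex> guard(mutex_);
        return on_bridge_[EB] + on_bridge_[WB];
    }

private:
    enum Dir { EB = 0, WB = 1 };
    static constexpr int kMaxRun = 5;  // the "five car rule" threshold

    static Dir opposite(Dir d) { return d == EB ? WB : EB; }

    void arrive(Dir d) {
        Dir other = opposite(d);
        std::unique_lock<std::mutex> lock(mutex_);
        waiting_[d]++;
        // Block while opposing traffic is on the bridge, or while this
        // direction has used up its run and opposing cars are waiting.
        while (on_bridge_[other] > 0 ||
               (run_[d] >= kMaxRun && waiting_[other] > 0)) {
            can_enter_[d].wait(lock);
        }
        waiting_[d]--;
        on_bridge_[d]++;
        run_[d]++;        // one more consecutive car in this direction
        run_[other] = 0;  // the opposing run has been interrupted
    }

    void leave(Dir d) {
        std::unique_lock<std::mutex> lock(mutex_);
        assert(on_bridge_[d] > 0);
        on_bridge_[d]--;
        if (on_bridge_[d] == 0) {
            // The bridge is empty, so opposing cars may be able to enter.
            can_enter_[opposite(d)].notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable can_enter_[2];  // indexed by direction
    int on_bridge_[2] = {0, 0};  // cars currently crossing
    int waiting_[2] = {0, 0};    // cars blocked in arrive
    int run_[2] = {0, 0};        // consecutive entries since the other side last entered
};
```

Every wait sits inside a while loop that re-tests the state variables, so a woken car re-checks the rules before entering. Notice that the state variables (cars crossing, cars waiting, length of the current run) were chosen by asking what each method needs to know in order to decide whether it can return.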

Exercise 3: Caltrain Automation

Caltrain has decided to improve its efficiency by automating not just its trains but also its passengers. From now on, passengers will be robots. Each robot and each train is controlled by a thread. You have been hired to implement synchronization that will guarantee orderly loading of trains. You must define a Station class that provides three methods:

  • When a train arrives in the station and has opened its doors, it invokes the method
    load_train(int available)
    
    where available indicates how many seats are available on the train. The method must not return until the train is satisfactorily loaded (all passengers are in their seats, and either the train is full or all waiting passengers have boarded).
  • When a passenger robot arrives in a station, it first invokes the method
    wait_for_train()
    
    This method must not return until a train is in the station (i.e., a call to load_train is in progress) and there are enough free seats on the train for this passenger to sit down. Once this method returns, the passenger robot will begin moving the passenger on board the train and into a seat (you do not need to worry about how this mechanism works).
  • Once a passenger is seated, it will call the method
    boarded()
    
    to let the station know that it's safely on board. This method is needed because it takes time for passengers to board the train once wait_for_train has given them the go-ahead, and the train mustn't leave the station until all of the admitted passengers are safely in their seats. This method never blocks.

Here is some additional information for this problem:

  • You do not have to write code to create threads; we will provide a test harness that creates Station objects, plus threads to represent passengers and trains, and then invokes the methods that you write. At any given time, there may exist any number of Station objects, passenger threads, and train threads (subject to the restrictions below).
  • Your code should not invoke any of the above methods (they will be invoked by the test harness).
  • You may assume that there is never more than one train in a given Station at once, and that all trains (and all passengers) are going to the same destination. Any passenger can board any train.
  • Your code must allow multiple passengers to board simultaneously (it must be possible for several passengers to have called wait_for_train, and for that method to have returned for each of the passengers, before any of the passengers calls boarded).
  • It must be possible to have multiple Station objects operating independently (activities in one Station should not affect any other Station).
  • Each method in our solution is fewer than 10 lines long (excluding comments). Your methods don't necessarily have to be this short, but if they are significantly longer then you're probably taking a more complicated approach than you need.

How to Write Synchronization Code

The best way to get started on synchronization problems like these is to think about what information about the state of the world needs to be kept in a Station object. Consider what information is needed to implement the methods. For example:

  • What information is needed in order to decide whether a passenger can safely board a train?
  • What information is needed to decide when a train can leave the station?

We call these variables state variables because they keep track of the state of the world, such as how many passengers are currently waiting. Once you have an initial guess about the state to maintain, you can then try writing the methods. For starters, add enough code to maintain proper values for all of the state variables. Then you can write code that uses those values to determine when methods can safely return. As you write and test the code, you may discover that your state variables don't have enough information to make important decisions; when that happens, you'll need to add more state variables.

Recommended Steps

  1. Write code to maintain the initial set of state variables you decided on; add comments marking the places where the code may need to block or wake up other threads.
  2. Next, ignore all of the state variables and add just enough code for an arriving passenger to block in wait_for_train until a train has arrived. When a train arrives, all blocked calls to wait_for_train should return, and load_train should return immediately. Ignore the boarding process and the boarded method. This isn't correct, of course, but it should allow your code to pass sanity check tests #2 and #3.
  3. Now write enough code for load_train to block correctly: as long as passengers are boarding or waiting to board, the train should stay in the station, leaving only when all boarding passengers have invoked boarded. Continue to ignore train capacity: as long as a train is in the station, passengers can begin boarding. This should pass sanity check tests #4, #5, and #6.
  4. Now factor in train capacity; even if passengers are waiting, a train should leave if it has no more seats. Once a train leaves the station, even if it had open seats, no passengers should board until another train with space arrives. And even if a train is in the station, a passenger shouldn't board if there are no more seats. You should now pass sanity check tests #7 and #8; woohoo!

Testing

When you're debugging you'll find it useful to run tests individually. The output of sanitycheck shows the command to run for each test; for example,

./caltrain_test no_waiting_passengers

will run the first test, in which trains arrive when there are no passengers waiting. Try invoking this command: it prints out a series of messages indicating what the test is doing and any errors it encounters. You can compare this to the output of the sample solution in samples/caltrain_test_soln. When you're debugging, you'll also find it useful to look at the code in the test harness so you can see exactly what is going on. For example, the code for the no_waiting_passengers test is in the method no_waiting_passengers in caltrain_test.cc; each test is implemented by a method with the same name as the test.

Once you have passed all of the sanity checks for this exercise, you should also test your Caltrain solution by invoking the following command:

./caltrain_test random 1000 50

This will run a randomized test where 1000 passengers arrive at the station, followed by a series of trains with different capacities (there will be at most 50 free seats in any one train). Check the output manually to make sure things look correct. Try running this test many times to make sure it always works (see Debugging Irreproducible Failures below). Try varying the parameters as well.

Note: the random test creates a large number of threads; depending on other activity on the machine, it's possible that the system may run out of threads. If this happens, the test will fail with an error message that says "Resource temporarily unavailable". When this happens, try again. If it fails repeatedly, you can either reduce the number of passengers or log out and log in to a different myth machine.

We do not guarantee that the tests we have provided are exhaustive, so passing all of the tests is not necessarily sufficient to ensure a perfect score (CAs may discover other problems in reading through your code).

Debugging Irreproducible Failures

Bugs in synchronization code often result in failures that only occur occasionally; most of you will experience problems like this at some point during this assignment. The easiest way to debug these problems is with tracing. Add statements to print out information each time a lock is acquired, before and after each condition variable wait, and after any other major decisions are made. Print out lots of information in each of these statements, including where in the code you are, which thread is running (if you can tell), and key state variables. Then pick a test that occasionally fails and run it repeatedly until it does fail (pipe the output to a file). If you walk through the log file, you will probably find a sequence of events that you hadn't expected to occur, and which your code can't handle properly.
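
A tracing helper along these lines might look like the sketch below. The function names and the two state variables shown (waiting and seats) are hypothetical; substitute whatever state your own class maintains. Calling it while holding the monitor's lock keeps the printed values consistent with each other.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical helper: builds one trace line containing the calling thread's
// id, a label for the location in the code, and key state variables.
std::string trace_line(const std::string& where, int waiting, int seats) {
    assert(!where.empty());  // a trace line without a location is not useful
    std::ostringstream out;
    out << "[thread " << std::this_thread::get_id() << "] " << where
        << " waiting=" << waiting << " seats=" << seats << "\n";
    return out.str();
}

// Emits the whole line with a single stream insertion, which makes it less
// likely that output from different threads is interleaved mid-line.
void trace(const std::string& where, int waiting, int seats) {
    std::cerr << trace_line(where, waiting, seats);
}
```

For example, a call such as trace("wait_for_train: before wait", waiting, seats) placed just before a condition variable wait, with a matching call just after it, shows which threads were blocked and what the state looked like when they woke up. Remember to remove or disable the tracing before you submit.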

Exercise 4: Party Introductions

You have been hired by Stanford's Greek Houses to help manage parties at fraternities and sororities. In particular, you must create a C++ class Party with a single method that can be used to introduce incoming guests to each other. When a guest arrives at a party, they will invoke the following method on a Party object:

std::string meet(std::string &my_name, int my_sign, int other_sign);

The my_name argument gives the name for this guest and my_sign indicates the guest's Zodiac sign. The other_sign argument indicates that this guest would like to meet someone with that Zodiac sign. The meet method must not return until a matching guest has arrived, that is, a guest whose my_sign equals this guest's other_sign and whose other_sign equals this guest's my_sign; meet must then return the name of the matching guest. In addition:

  • As with Caltrain, you do not need to write code to create threads, construct Party objects, or invoke the meet method: the test framework code in party_test.cc will do that for you. All you have to do is complete the files party.hh and party.cc.
  • Guests must be matched: if A receives B's name, then B must receive A's name.
  • If a suitable match is not immediately available, then meet must wait until a match becomes available. In the meantime, it must be possible for non-interfering matches to occur.
  • Zodiac signs are represented with integers in the range 0-11, inclusive, which correspond to the birth months January-December. You may assume that callers never provide values outside the range 0-11.
  • It is possible for multiple guests to have the same name.
  • It must be possible to have multiple Party objects all operating independently (activities in one Party should not affect any other Party).
  • Guests must be matched fairly: if there are several waiting guests that match a new arrival, the one that has been waiting longest should generally be chosen. See the section below about FIFO behavior for locks and condition variables.
  • tools/sanitycheck will run a series of tests on your solution.
  • Once all of the sanity check tests pass, you should also test your party solution by invoking the following command:
    ./party_test random_party 100 4
    
    This runs a randomized test with a large number of party guests with different preferences. The first parameter (100) indicates how many guests will arrive, and the second parameter indicates how many distinct signs should appear among all the arrivals (4 means that only 4 of the numbers between 0 and 11 will be used as my_sign or other_sign; a smaller value like 4 is more likely to produce conflicts that expose bugs). Check the output manually to make sure there are no error messages. Try running this test many times (and also try varying the parameters), in order to generate a variety of scenarios.

For this exercise we aren't going to give you hints about how to break the implementation down into steps. Use what you learned while implementing the Caltrain solution to devise a plan for this exercise.

Are Locks and Condition Variables FIFO?

The C++ classes for locks and condition variables are mostly FIFO (First In First Out: the first thread to wait is the first thread to wake up) but not perfectly so. If you're curious to see this, try running

./party_test cond_fifo

This runs a test to see if condition variables are perfectly FIFO. If you run it several times, you'll see that it sometimes fails (the first thread to wait on a condition variable doesn't necessarily get notified first). For your assignments, you can assume that this behavior is close enough to FIFO for fairness considerations (e.g. this is good enough for the "fair matching" requirement for Party).

However, your code must not depend on perfect FIFO behavior for correctness: for example, it is not safe to assume that when you notify a condition variable, the first thread on the queue is guaranteed to be the first thread to acquire the lock. Even if condition variables were perfectly FIFO, if two threads are notified at about the same time, the scheduler might choose to run them in any order, and this will affect which one gets the lock first.
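
The usual defense is to re-test the state after every wakeup instead of assuming that the notified thread runs next. The sketch below illustrates this with a hypothetical PermitPool monitor (not part of the assignment); all names in it are made up for illustration. It uses the mutex, lock_guard, and condition_variable_any combination, in which the wait is given the mutex itself.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical monitor: a pool of permits that threads acquire and release.
class PermitPool {
public:
    explicit PermitPool(int permits) : permits_(permits) {
        assert(permits >= 0);
    }

    void acquire() {
        std::lock_guard<std::mutex> guard(mutex_);
        // "while", not "if": another thread may have taken the permit between
        // the notify and the moment this thread reacquired the mutex.
        while (permits_ == 0) {
            available_.wait(mutex_);  // releases mutex_ while blocked, relocks before returning
        }
        permits_--;
    }

    void release() {
        std::lock_guard<std::mutex> guard(mutex_);
        permits_++;
        available_.notify_one();
    }

    int permits() {
        std::lock_guard<std::mutex> guard(mutex_);
        return permits_;
    }

private:
    std::mutex mutex_;
    std::condition_variable_any available_;
    int permits_;
};

// Four threads share two permits; each acquires one and then gives it back.
int run_permit_demo() {
    PermitPool pool(2);
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; i++) {
        threads.emplace_back([&pool] {
            pool.acquire();
            pool.release();
        });
    }
    for (std::thread& t : threads) {
        t.join();
    }
    return pool.permits();  // every permit has been returned
}
```

Because acquire re-checks permits_ in a loop, the code stays correct regardless of which waiting thread wins the race for the mutex; wakeup order affects only fairness.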

Complexity

Synchronization code is hard to get right, so it's important for it to be clean, simple, and obvious. Unfortunately, submissions for this assignment often end up long and complicated; such solutions rarely work, and in real life they would be brittle and hard to maintain. Thus the CAs will consider code complexity as a major factor in your style score for this assignment, and we will reduce your project score by up to 20% if there are significant complexity and/or other style issues. Our solution for caltrain.cc has 49 lines and our solution for party.cc has 38 lines, not including blank lines and comments. Note: your goal should be simplicity, not just line count; simple programs are usually shorter than complex ones, but the shortest program isn't always the simplest.

Other Requirements

  • Your solutions for both problems must be written in C++.
  • You must use the C++ library classes for synchronization. One way to do this is with the classes mutex, unique_lock, and condition_variable. Another set that is functionally equivalent for this assignment is mutex, lock_guard, and condition_variable_any. Do not invoke the lock and unlock methods of a mutex directly; use guards such as unique_lock or lock_guard, as discussed in section.
  • You may use other STL container classes (vector, etc.) in your solutions, but you may not use any classes that simplify the synchronization issues. If you have any doubt about whether a class is acceptable or not, please check with the course staff.
  • There must be no busy-waiting in your solutions.
  • You must use the monitor design pattern for synchronization, as discussed in lecture. In particular, you may not use more than one mutex per Station or Party object.
  • You may use more than one condition variable for each Station or Party object. Code using one condition variable to wake up many different threads can sometimes be improved by using multiple condition variables. This way only threads you really want to wake get woken up.
  • You may assume that there are no spurious wakeups on condition variables. If you haven't heard the term "spurious wakeup" before, don't worry; you can just ignore this comment.
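
As an illustration of several of these requirements together, here is a sketch of a classic monitor that is unrelated to the assignment problems: a bounded queue. The class and all of its names are hypothetical. It uses a single mutex, acquires it only through unique_lock, never busy-waits, and has two condition variables so that producers and consumers are woken separately.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical monitor: a fixed-capacity queue of integers.
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the queue is full.
    void put(int value) {
        std::unique_lock<std::mutex> lock(mutex_);
        while (items_.size() == capacity_) {
            not_full_.wait(lock);
        }
        items_.push_back(value);
        not_empty_.notify_one();  // wakes a consumer only
    }

    // Blocks while the queue is empty.
    int get() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (items_.empty()) {
            not_empty_.wait(lock);
        }
        int value = items_.front();
        items_.pop_front();
        not_full_.notify_one();  // wakes a producer only
        return value;
    }

private:
    std::mutex mutex_;                   // the monitor's only mutex
    std::condition_variable not_full_;   // producers wait here
    std::condition_variable not_empty_;  // consumers wait here
    std::deque<int> items_;
    const std::size_t capacity_;
};

// One producer sends 1..100 through a queue of capacity 2; the caller
// consumes all the values and returns their sum.
int run_queue_demo() {
    BoundedQueue queue(2);
    std::thread producer([&queue] {
        for (int i = 1; i <= 100; i++) {
            queue.put(i);
        }
    });
    int sum = 0;
    for (int i = 0; i < 100; i++) {
        sum += queue.get();
    }
    producer.join();
    return sum;
}
```

With a single condition variable, every notification would have to wake all waiters so that each could re-check its own condition; splitting the waiters into not_full_ and not_empty_ lets each notify target only the kind of thread that can make progress.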

Submitting

Once you are finished working and have saved all your changes, submit by running tools/submit. Make sure that you have answered the question in questions.txt before submitting.

We recommend you do a trial submission in advance of the deadline to allow time to work through any snags. You may submit as many times as you like; we will grade the latest submission. Submitting a stable but unpolished/unfinished version is like an insurance policy. If the unexpected happens and you miss the deadline to submit your final version, the earlier submission will earn points. Without a submission, we cannot grade your work. You can confirm the timestamp of your latest submission in your course gradebook.

Grading

Here is a recap of the work that will be graded on this assignment:

  • questions.txt: answer the question.
  • caltrain.hh and caltrain.cc: flesh out the Station class.
  • party.hh and party.cc: flesh out the Party class.

We will grade your code using the provided sanity check tests and possible additional autograder tests. We will also review your code for style and complexity. Check out our course style guide for tips and guidelines for writing code with good style!