Assignment 1: Lambdas, Threads, and Processes

This assignment consists of a series of exercises to help you explore three constructs for managing the execution of code: lambdas, threads, and processes. You will also have a chance to experiment with atomic operations in various forms. Here are the learning goals for the assignment:

Become comfortable using lambdas to create unnamed functions, and using captures to pass additional information to lambdas.
Learn how to create new threads, either in the same process or in a new process. Understand how these approaches are similar and different.
Experiment with atomic operations to see how they impact the behavior of concurrent systems and how they can be used to enforce ordering between concurrent threads.

Getting Started

You will do all of your work for this assignment on the myth cluster. To get started, log in to a myth machine and invoke the following command:

git clone /afs/ir/class/archive/cs/cs111/cs111.1236/repos/assign1/$USER assign1

This will create a new directory assign1 in your current directory and it will clone a Git starter repository into that directory. You will do your work for the assignment in the new directory; along the way, you will modify several of the files. For example, there is a file questions.txt that contains several questions for you to answer; as you work through the exercises, you will add your answers into this file. You can use Git to manage your revisions to the assignment; we recommend that you make frequent commits by invoking the command git commit -a to make sure you don't accidentally lose work.

Exercise 1: Lambdas

(You will go through this exercise in section with your CA)

Over the course of the quarter you will need to use several C++ features that you may not have seen before. The assignments will introduce you to these features as they become needed. In this assignment you will learn about C++ lambdas: you'll use them when creating new threads.

A lambda is an anonymous C++ function (one without a name) that you can declare anywhere, even in the middle of other code. A lambda has a body just like a regular C++ function, it takes arguments just like a function, and it returns a result just like a function; the only thing it is missing is a name. Lambdas also have the interesting property that they can access values from the scope in which they are declared; this is called a capture. Lambdas are typically used as a way of passing small bits of code as arguments to methods.

Let's start with an example that does not use lambdas. Take a look at the file nearest.cc. This program creates an array of integers, sorts the array using the C++ function std::sort, then prints the sorted array. Type make to compile this program, then run it with the command

./nearest

As you can see from the output, the program doesn't sort the values in the normal way (in order of size). Instead, it sorts the values by how close they are to the number 50, with the closest value appearing first. It does this by providing a special-purpose comparison function (compare) as an argument to std::sort. The comparison function takes two values and returns a boolean indicating whether the first value should be considered "less than" the second. Given this comparison function, std::sort does all the rest of the work for sorting. In this program, compare returns true when its first argument is closer to 50 than the second.

Now answer Question 1A in questions.txt.

Next, look over the file nearest2.cc. This program is identical to nearest.cc except that it uses a lambda instead of the compare function. The lambda is the third argument to std::sort, consisting of the following:

[] (int val1, int val2) -> bool {
    int dist1 = abs(val1 - 50);
    int dist2 = abs(val2 - 50);
    return dist1 < dist2;
}

This looks a lot like the definition of compare in nearest.cc, except for the absence of a function name and the presence of []. The construct "-> bool" indicates the return type of the function. This lambda defines a function that takes two arguments and returns a boolean, just like compare. Try running nearest2 to convince yourself that it produces the same output as nearest.

Now take a look at the file nearest3.cc; this program uses a lambda with a capture. The behavior of nearest3.cc is similar to nearest2.cc, except that the target value is now a command-line parameter. In nearest and nearest2, the target was hard-wired as 50, and the array was sorted by how close to 50 each element is. With nearest3.cc you pass the target value as a command-line option. Try running nearest3 with a few different targets:

./nearest3 70
./nearest3 100
./nearest3 50

The last example should behave identically to nearest2. In nearest3.cc the lambda looks like this:

[target] (int val1, int val2) -> bool {
    int dist1 = abs(val1 - target);
    int dist2 = abs(val2 - target);
    return dist1 < dist2;
}

The construct [target] is a capture: it makes a copy of the target value from the enclosing code and allows the code in the lambda to access it under the name target. The capture mechanism is extremely convenient: it effectively allows extra parameters to be passed to the lambda without them being visible to std::sort. In general, the capture can include more than one value, separated by commas. In nearest2.cc no variables were captured, hence the [].

You may be wondering why captures are needed. Instead, why not pass target as an additional argument to std::sort, and then have std::sort pass it on to the lambda as an argument? Unfortunately, this would require modifying std::sort to handle an extra argument. And, different uses of std::sort might need to pass different values (or different numbers of values) to their lambdas. The capture mechanism allows the values to be passed from the enclosing method (main) directly to the lambda without std::sort knowing anything about them. That is much cleaner and more general.

Now answer Question 1B in questions.txt.

This is a very brief introduction to lambdas, which should give you enough information to use basic lambdas in this class. Lambdas have several other useful features; if you're curious to learn more, take a look at this Web page: https://docs.microsoft.com/en-us/cpp/cpp/lambda-expressions-in-cpp?view=msvc-170.

Exercise 2: filter

Open the file filter.cc. This program contains a function print_lines, which will read the lines of a file and print all those that meet certain criteria. When print_lines is invoked, the first argument contains the name of a file to read, and the second argument is a function that is invoked for each line. If that function returns true, then the line will be printed.

When complete, filter.cc must invoke print_lines with a lambda that takes a string as argument (a line of the file) and returns true if the string contains any of several substrings. The file to read and the substrings are provided as command-line arguments to the program, using argc and argv. For example, consider the following command:

./filter print.cc main line

This should print all of the lines in print.cc that contain either main or line as substrings:

void thread_main(int id)
        printf("Thread %d used printf for line %d\n", id, i);
        //std::cout << "Thread " << id << " used <iostream> for line " << i << std::endl;
int main(int argc, char **argv)
        threads[i] = new std::thread([i] {thread_main(i);});

Your job for this exercise is to write the code that invokes print_lines with an appropriate lambda. You will need to use a capture in your lambda in order to access the command-line arguments. For starters, add code that invokes print_lines in a way that prints every line. Then change it so that no lines are printed. Finally, change it so that only the matching lines are printed.

Once you think you have your program running correctly, try running tools/sanitycheck. This will run a few automated tests over your code. If there is a test failure, you can track it down by running the test case by hand. Notice that sanitycheck prints out the command it ran for each test; you can try running the same command by hand, and also run the sample solution in examples/filter_soln with the same arguments. If needed, you can use gdb to step through your code to diagnose the problem.

Exercise 3: print

(You will go through this exercise in section with your CA)

Look over the file print.cc. The main program creates several threads, each of which executes the thread_main function. That function prints 50 lines of output using the C printf function. Each line identifies the printing thread and the line number.

Type make to compile this program, then type
```
./print > print.out
```
This will run the program and redirect its output to the file print.out. Take a look at that file, then run the program again: does it produce the same output each time it is run? Now answer Questions 3A and 3B in questions.txt.

Find the line of code in print.cc that is commented out (it starts with std::cout). Uncomment that line, and comment out the printf line just above it. This will change the code to use the C++ I/O library instead of printf. Invoke make to recompile the program, then run it again, redirecting output to print.out2. Take a look at print.out2 and compare it to print.out generated previously. Then answer Question 3C in questions.txt.

As mentioned in lecture, the behavior of concurrent programs depends on what operations are atomic: an atomic operation appears to execute instantaneously, without interruption. The difference in output from these two versions seems to indicate that different operations are atomic in the two versions of the program. Now answer Question 3D in questions.txt.

Exercise 4: inc

Open the file inc.cc and add code to the function main so that it creates NUM_THREADS child threads, each of which runs the thread_main function, and then waits for all of them to complete (similar to print.cc). Each child thread will increment variable counter 100,000 times, then after all the threads complete, the main thread will print the final value. Before running the program, answer Question 4A in questions.txt.

Now run make to compile the program, then invoke it:
```
./inc
```
How does the answer compare with what you predicted in Question 4A? Answer Question 4B in questions.txt.

In this program the variable counter is shared among all of the threads. That is because it is declared outside any function as a static variable: there is just one copy for the entire program. You can also tell that the variable must be shared by looking at the output; think about why that is the case, then answer Question 4C in questions.txt.

Note: the shared variable must be declared volatile in this example; otherwise the compiler will assume it can make certain optimizations. For example, rather than executing the loop in thread_main 100,000 times, it will just add 100,000 to the value of the variable (try removing the volatile keyword and see what happens to the output, then restore the volatile again). Once we start using better synchronization such as atomics and locks, you won't need to declare variables volatile anymore (nowadays, volatile declarations are generally considered to be a sign of bad coding).

The program's output suggests that some increments of the counter variable are getting "lost". This must mean that increments are not atomic (and they aren't: in this version of the program only reads and writes are atomic). Think about what happens if two threads try to increment the variable at the same time and construct a precise ordering between the threads that causes one of their increments to be lost. Then answer Question 4D in questions.txt.

Find the two commented-out lines of code in inc.cc. Uncomment each of these lines, but comment out the line above each. This will change the counter variable from a simple int to a std::atomic<int>; this new type can be used for anything an int can be used for, but it has special features so it can be used effectively for communication between threads. In particular, it offers atomic operations other than read and write; for example, increments on this variable are atomic (they use special machine instructions that carry out the increment in an atomic fashion).

Recompile the program and run it a few times, then answer Question 4E in questions.txt.

Exercise 5: print2 and print3

This exercise uses shared variables to control the order in which threads execute.

Open the file print2.cc and familiarize yourself with its code. Like print.cc, the main program creates 10 child threads, each of which prints 50 lines of output. In addition, child thread 2 uses the last_printed variable to keep track of its progress. After the main thread creates all the children, it waits until thread 2 has printed 20 lines, and then it prints a message. After that, it waits for all the children to exit.

Invoke make to compile this program, then run it like this:
```
./print2 > print2.out
```
Then look at the output file print2.out to verify that the line "Thread 2 has printed line 20" appears after the line "Thread 2 is printing line 20". Now run the program like this:
```
./print2 | grep "Thread 2" | grep "line 20"
```
In this pipeline, the first grep command will pass only the lines that contain "Thread 2" and the second grep command will pass only the lines that contain "line 20". Run this a few times and then answer Question 5A in questions.txt.

Modify print2.cc so that thread 4 will wait to print each line until the line with the same number has been printed by thread 2. Run the program and check the output file to make sure that the two threads print their lines in the correct order. You may find the following command useful:
```
./print2 | egrep 'Thread 2|Thread 4'
```
This will print only the output lines that contain either "Thread 2" or "Thread 4".

In addition, we have provided a script order.py that will check the output precisely. To use it, invoke the following command:
```
./print2 | ./order.py
```
The output from order.py will tell you whether everything is OK; if not, it will identify a specific line in the output that is wrong. Once this test seems to be passing, try it several times to make sure that it always passes (the order of thread execution will vary from run to run). If you invoke tools/sanitycheck it will run this test automatically.

The file print3.cc contains the beginnings of a program. Your challenge is to complete this program so that the threads print in lock-step order: first thread 0 prints line 0, then thread 1 prints line 0, and so on up until thread 9 prints line 0. Then thread 0 prints line 1, thread 1 prints line 1, and so on. Do not change any of the existing code, but feel free to add more code in either main or thread_main, as well as additional variable declarations. Test your code to make sure it works as expected.

Once you think your code works, run tools/sanitycheck: it will check the output of your program against the sample solution. If the print3 test fails and you're having trouble finding where your output differs from the sample solution, invoke the following shell commands:
```
./print3 > my.out
samples/print3_soln > soln.out
diff my.out soln.out
```
The output from the diff program will show you exactly which lines are different.

Find the commented-out line in print3.cc (it starts with std::cout). Uncomment this line, but comment out the line immediately above it. Compile and run this modified version, and take a look at the output. Then answer Question 5B in questions.txt.

Exercise 6: fork_print and exec

(You will go through this exercise in section with your CA)

Open the file fork_print.cc. This program provides a very simple example of using the fork and waitpid system calls discussed in lecture. The fork function is a system call, which means it invokes the operating system to perform an operation. In this case, the operating system creates a new process with a single thread whose state is an identical copy of the state of the calling process. The system call then returns twice (!), once in the parent and once in the child; the return value can be used by each process to identify itself. The child prints a message, increments the variable x, prints the new value, and then exits. The parent invokes waitpid to wait for the child to exit, then prints a message containing the value of x.

Try running the program:

./fork_print

Then answer Question 6A.

Now open the file exec.cc. In practice, after a fork the child almost always invokes the execvp system call; that is the case in this program. execvp replaces all of the state of the calling process with that of a new program, and it starts that program running in the process by invoking its main function. The first argument to execvp is the name of the program (the operating system will search the directories in your path to find an executable file with this name) and the second argument is an array of argument words, which will appear in the child as the argc and argv arguments to the main function. The first word of the arguments is usually the same as the program name.

The exec program is a tiny shell; think of it as a super-simplified version of the shell you use. It reads lines from standard input, splits each line up into words at space characters, and then calls fork and execvp to run the specified program in a child process. The parent waits for the child to exit and then reads the next line.

Try out the program by invoking

./exec

then try typing simple shell commands such as the following

ls
echo a b c

This program doesn't support any of the fancy features of a real shell, such as pipes, I/O redirection, variables, or quoting.

Line splitting is performed by the Splitter class in exec.cc. A line can be split by constructing a Splitter object with that line as argument and then invoking the get_words method. get_words will return an array of char* pointers to the words of the line in just the right form needed by execvp.

Exercise 7: exec_many

For this exercise you must add code to the file exec_many.cc to implement the functionality described here. exec_many is similar to exec in that it reads lines from standard input and runs a program in a child process for each line. However, it doesn't run the programs right away. Instead, it reads lines of input and saves them (perhaps in a std::vector) until it encounters an empty line. When this happens, exec_many runs all of the accumulated commands, with one child process for each command. Furthermore, it must run the commands in parallel, not sequentially: it should not wait for any of the commands to complete until all of the commands have started. Once all the child processes have exited, exec_many goes back to reading lines of input until it encounters another empty line, and so on.

The main function in exec_many.cc is currently just a skeleton that reads lines of input and discards them. You must add the functionality described above, then test your program to make sure it works. We don't have any automated tests for this program, so you'll have to think about how to test it on your own. Once you have a working program, answer Question 7A in questions.txt. Note: you don't need to create new threads within exec_many.cc for this exercise; just use fork and execvp.

Submitting

Once you are finished working and have saved all your changes, submit by running tools/submit. Make sure that you have answered all of the questions in questions.txt before submitting.

We recommend you do a trial submission in advance of the deadline to allow time to work through any snags. You may submit as many times as you like; we will grade the latest submission. Submitting a stable but unpolished or unfinished version is like an insurance policy: if the unexpected happens and you miss the deadline to submit your final version, the earlier submission will earn points. Without a submission, we cannot grade your work. You can confirm the timestamp of your latest submission in your course gradebook.

Grading

Here is a recap of the work that will be graded on this assignment:

questions.txt: answer all of the questions.
filter.cc: call print_lines with an appropriate lambda.
print3.cc: make the threads execute in lock-step order.
exec_many.cc: execute several child processes in parallel.

We will grade your code using the provided sanity check tests, possible additional autograder tests, and some hand checks for print3.cc and exec_many.cc. We will also review your code for style grading. Check out our course style guide for tips and guidelines for writing code with good style!