Threads

Threads, Processes, and Dispatching

Optional readings for this topic from Operating Systems: Principles and Practice: Chapter 4.

Threads and Processes

How to manage concurrency (many things happening at once)?

Thread: a piece of code executing sequentially on a single core

  • Executes a series of instructions in order (only one thing happens at a time).
  • Concurrent activities can be implemented with a collection of threads, each of which is sequential.

Execution state: everything that can affect, or be affected by, a thread:

  • Code, data, registers, call stack, open files, network connections, time of day, etc.

Process: one or more threads, along with their execution state.

  • Part of the process state is shared among all threads in the process
  • Part of the process state is private to a thread
  • Most processes have only one thread, and early OSes only allowed one thread per process.
  • Why allow multiple threads per process?

Thread analogy: videos

Process Creation

To create a new process, invoke a system call: like a method call, except that it invokes code in the operating system.

Linux system calls for process management:

  • fork makes copy of current process, with one thread: returns twice!
    • How can the parent and child tell which is which?
  • exec replaces memory with code and data from a given executable file. Doesn't return ("returns" to starting point of new program).
  • waitpid waits for a given process to exit.
  • Example:
    #include <sys/wait.h>
    #include <unistd.h>

    char *args[] = {"ls", "-l", NULL};
    int status;
    int child_pid_or_zero = fork();   /* Returns 0 in the child, the child's pid in the parent */
    if (child_pid_or_zero == 0) {
        /* Child process: replace this program with ls */
        execvp(args[0], args);
    } else {
        /* Parent process: wait for the child to exit */
        waitpid(child_pid_or_zero, &status, 0);
    }
    
  • Advantage: can modify process state before calling exec (e.g. change environment, open files); see the sketch below.
  • Disadvantage: wasted work (most of forked state gets thrown away).
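
As an illustration of that advantage, here is a minimal sketch in which the child redirects its output to a file before calling exec (the file name and command are illustrative; the calls are standard POSIX):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int pid = fork();
    if (pid == 0) {
        /* Child: redirect stdout to a file, then replace this program with ls */
        int fd = open("out.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        dup2(fd, STDOUT_FILENO);
        close(fd);
        char *args[] = {"ls", "-l", NULL};
        execvp(args[0], args);
        perror("execvp");   /* Only reached if exec fails */
        _exit(1);
    } else {
        int status;
        waitpid(pid, &status, 0);
    }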

System calls for process management in Windows:

  • CreateProcess combines fork and exec:
    BOOL CreateProcess(
        LPCTSTR lpApplicationName,
        LPTSTR lpCommandLine,
        LPSECURITY_ATTRIBUTES lpProcessAttributes,
        LPSECURITY_ATTRIBUTES lpThreadAttributes,
        BOOL bInheritHandles,
        DWORD dwCreationFlags,
        PVOID lpEnvironment,
        LPCTSTR lpCurrentDirectory,
        LPSTARTUPINFO lpStartupInfo,
        LPPROCESS_INFORMATION lpProcessInformation
    );
    
  • Any state the child should inherit or change (environment, handles, working directory, etc.) must be passed as arguments: there is no chance to run code in the child before the new program starts.
  • WaitForSingleObject waits for a child to complete:
    WaitForSingleObject(lpProcessInformation->hProcess,
        INFINITE);
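
Putting these together, a minimal usage sketch in Win32 C (the program name is illustrative):

    #include <windows.h>

    STARTUPINFO si = {0};
    PROCESS_INFORMATION pi = {0};
    si.cb = sizeof(si);

    /* lpCommandLine must be a writable buffer; the program name is illustrative */
    TCHAR cmd[] = TEXT("notepad.exe");
    if (CreateProcess(NULL, cmd, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi)) {
        WaitForSingleObject(pi.hProcess, INFINITE);   /* Wait for the child to exit */
        CloseHandle(pi.hProcess);
        CloseHandle(pi.hThread);
    }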
    

Thread Creation

To create a new thread, use the C++ standard library (it will invoke system calls):

#include <thread>
std::thread t(func);
...
t.join();

void func() {
    /* This code runs concurrently with the code above. */
}
  • join will wait for the child thread to exit (when func returns)
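
A slightly larger sketch (the function name and thread count are illustrative): several threads created in a loop, each running the same function concurrently, then joined:

    #include <cstdio>
    #include <thread>
    #include <vector>

    void worker(int id) {
        /* Each worker runs concurrently with the others and with main */
        std::printf("hello from thread %d\n", id);
    }

    int main() {
        std::vector<std::thread> threads;
        for (int i = 0; i < 4; i++) {
            threads.emplace_back(worker, i);   /* Create and start a thread */
        }
        for (auto& t : threads) {
            t.join();                          /* Wait for each thread to finish */
        }
    }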

Dispatching

Almost all computers today can execute multiple threads simultaneously:

  • Each processor chip typically contains multiple cores
  • Each core contains a complete processor (or CPU: Central Processing Unit) capable of executing a thread
  • Many modern processors support hyperthreading: each physical core behaves as if it is actually two cores, so it can run two threads simultaneously (e.g. execute one thread while the other is waiting on a cache miss).
  • For example, a server in a datacenter might contain 2 Intel processor chips, each with 24 cores, where each core supports 2-way hyperthreading. Overall, this server can run 96 threads simultaneously.

Typically have more threads than cores

At any given time, most threads do not need to execute (they are waiting for something).

OS uses a process control block to keep track of each process:

  • Saved execution state for each thread (saved registers, etc.)
  • Scheduling information
  • Information about memory used by this process
  • Information about open files
  • Accounting and other miscellaneous information

At any given time a thread is in one of 3 states:

  • Running
  • Blocked: waiting for an event (hard drive I/O, incoming network packet, etc.)
  • Ready: waiting to be scheduled on a core
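
One way to picture these structures (a sketch only, with illustrative field names; no real OS lays them out exactly this way):

    typedef struct { unsigned long regs[32]; } saved_registers_t;   /* Placeholder */

    enum thread_state { RUNNING, BLOCKED, READY };

    struct thread {                     /* Per-thread (private) state */
        enum thread_state state;
        saved_registers_t saved_regs;   /* Saved execution state (registers, stack pointer, ...) */
        struct thread *next;            /* Link for the ready queue or a wait queue */
    };

    struct process {                    /* Process control block: state shared by all threads */
        struct thread *threads;         /* One or more threads */
        void *address_space;            /* Information about memory used by this process */
        void *open_files;               /* Information about open files */
        /* Scheduling, accounting, and other miscellaneous information ... */
    };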

Dispatcher: innermost portion of the OS that runs on each core:

  • Let a thread run for a while
  • Save its execution state
  • Load state of another thread
  • Let it run ...

Context switch: changing the thread currently running on a core by first saving the state of the old thread, then loading the state of the new thread.
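
As a rough sketch (using the illustrative struct thread above; the actual saving and restoring of registers must be written in assembly):

    void save_state(struct thread *t) { /* copy the CPU registers into t->saved_regs */ }
    void load_state(struct thread *t) { /* copy t->saved_regs back into the CPU registers */ }

    void context_switch(struct thread *old, struct thread *next) {
        save_state(old);           /* 1. Save the state of the old thread */
        old->state  = READY;       /*    (or BLOCKED, if it is waiting for an event) */
        next->state = RUNNING;
        load_state(next);          /* 2. Load the state of the new thread; it resumes
                                         wherever it was when it was last switched out */
    }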

What causes the dispatcher to run?

  • The running thread blocks (invokes the dispatcher explicitly)
  • But what if a thread just keeps executing without blocking? The OS has lost control.
  • Interrupts/Traps

Interrupts (events occurring outside the current thread that cause a state switch into the kernel):

  • Character typed at keyboard.
  • Completion of disk operation.
  • Timer: to make sure OS eventually gets control.

Traps (events occurring in current thread that cause a change of control into the kernel):

  • System call.
  • Error (illegal instruction, addressing violation, etc.).
  • Page fault.

The dispatcher is not itself a thread

  • It is just code that is invoked to perform the dispatching function

How does the dispatcher decide which thread to run next (assuming just one core)?

  • Simplest approach: Link together the ready threads into a queue. Dispatcher grabs first thread from the queue. When threads become ready, insert at back of queue.
  • More complex/powerful: give each thread a priority, organize the queue according to priority. Or, perhaps have multiple queues, one for each priority class.
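
A sketch of the simplest approach, reusing the illustrative struct thread above (its next field links threads into the queue; the function names are made up):

    struct thread *ready_head = NULL, *ready_tail = NULL;   /* FIFO ready queue */

    void make_ready(struct thread *t) {        /* Thread becomes ready: insert at back */
        t->state = READY;
        t->next  = NULL;
        if (ready_tail) ready_tail->next = t; else ready_head = t;
        ready_tail = t;
    }

    struct thread *pick_next(void) {           /* Dispatcher grabs first thread in queue */
        struct thread *t = ready_head;
        if (t) {
            ready_head = t->next;
            if (ready_head == NULL) ready_tail = NULL;
        }
        return t;
    }

For priorities, one could keep a separate queue like this for each priority class and have pick_next scan from the highest class to the lowest.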