Thrashing and Working Sets

Lecture Notes for CS 140
Winter 2012
John Ousterhout

Readings for this topic from Operating System Concepts: Sections 9.6, 9.10.
Normally, if a thread takes a page fault and must wait for the page to be read from disk, the operating system runs a different thread while the I/O is occurring. Thus page faults are "free"?
What happens if memory gets overcommitted?
- Suppose the pages being actively used by the current threads don't all fit in physical memory.
- Each page fault causes one of the active pages to be moved to disk, so another page fault will occur soon.
- The system will spend all its time reading and writing pages, and won't get much work done.
- This situation is called thrashing; it was a serious problem in early demand paging systems.

How to deal with thrashing?
- If a single process is too large for memory, there is nothing the OS can do. That process will simply thrash.
- If the problem arises because of the sum of several processes:
  - Figure out how much memory each process needs.
  - Change scheduling priorities to run processes in groups that fit comfortably in memory: must shed load.
Working Sets: conceptual model proposed by Peter Denning to prevent thrashing.
- Informal definition: the collection of pages a process is using actively, and which must thus be memory-resident to prevent this process from thrashing.
- If the sum of all working sets of all runnable threads exceeds the size of memory, then stop running some of the threads for a while.
- Divide processes into two groups: active and inactive:
  - When a process is active its entire working set must always be in memory: never execute a thread whose working set is not resident.
  - When a process becomes inactive, its working set can migrate to disk.
  - Threads from inactive processes are never scheduled for execution.
  - The collection of active processes is called the balance set.
  - The system must have a mechanism for gradually moving processes into and out of the balance set.
  - As working sets change, the balance set must be adjusted.
How to compute working sets?
- Denning proposed a working set parameter T: all pages referenced in the last T seconds comprise the working set.
- Can extend the clock algorithm to keep an idle time for each page.
- Pages with idle times less than T are in the working set.
Difficult questions for the working set approach:
- How long should T be (typically minutes)?
- How to handle changes in working sets?
- How to manage the balance set?
- How to account for memory shared between processes?
Page Fault Frequency: another approach to preventing thrashing.
- Per-process replacement; at any given time, each process is allocated a fixed number of physical page frames.
- Monitor the rate at which page faults are occurring for each process.
- If the rate gets too high for a process, assume that its memory is overcommitted; increase the size of its memory pool.
- If the rate gets too low for a process, assume that its memory pool can be reduced in size.
- If the sum of all memory pools don't fit in memory, deactivate some processes.
In practice, today's operating systems don't worry much about thrashing:
- With personal computers, users can notice thrashing and handle it themselves:
  - Typically, just buy more memory
  - Or, manage balance set by hand
- Thrashing was a bigger issue for timesharing machines with dozens or hundreds of users:
  - Why should I stop my processes just so you can make progress?
  - System had to handle thrashing automatically.
- Technology changes make it unreasonable to operate machines in a range where memory is even slightly overcommitted; better to just buy more memory.

CS 140: Operating Systems (Winter 2012)

Thrashing and Working Sets