Thrashing and Working Sets

Lecture Notes for CS 140
Winter 2012
John Ousterhout

  • Readings for this topic from Operating System Concepts: Sections 9.6, 9.10.
  • Normally, if a thread takes a page fault and must wait for the page to be read from disk, the operating system runs a different thread while the I/O is occurring. Thus page faults are "free"?
  • What happens if memory gets overcommitted?
    • Suppose the pages being actively used by the current threads don't all fit in physical memory.
    • Each page fault causes one of the active pages to be moved to disk, so another page fault will occur soon.
    • The system will spend all its time reading and writing pages, and won't get much work done.
    • This situation is called thrashing; it was a serious problem in early demand paging systems.
  • How to deal with thrashing?
    • If a single process is too large for memory, there is nothing the OS can do. That process will simply thrash.
    • If the problem arises because of the sum of several processes:
      • Figure out how much memory each process needs.
      • Change scheduling priorities to run processes in groups that fit comfortably in memory: must shed load.
  • Working Sets: conceptual model proposed by Peter Denning to prevent thrashing.
    • Informal definition: the collection of pages a process is using actively, and which must thus be memory-resident to prevent this process from thrashing.
    • If the sum of all working sets of all runnable threads exceeds the size of memory, then stop running some of the threads for a while.
    • Divide processes into two groups: active and inactive:
      • When a process is active its entire working set must always be in memory: never execute a thread whose working set is not resident.
      • When a process becomes inactive, its working set can migrate to disk.
      • Threads from inactive processes are never scheduled for execution.
      • The collection of active processes is called the balance set.
      • The system must have a mechanism for gradually moving processes into and out of the balance set.
      • As working sets change, the balance set must be adjusted.
  • How to compute working sets?
    • Denning proposed a working set parameter T: all pages referenced in the last T seconds comprise the working set.
    • Can extend the clock algorithm to keep an idle time for each page.
    • Pages with idle times less than T are in the working set.
  • Difficult questions for the working set approach:
    • How long should T be (typically minutes)?
    • How to handle changes in working sets?
    • How to manage the balance set?
    • How to account for memory shared between processes?
  • Page Fault Frequency: another approach to preventing thrashing.
    • Per-process replacement; at any given time, each process is allocated a fixed number of physical page frames.
    • Monitor the rate at which page faults are occurring for each process.
    • If the rate gets too high for a process, assume that its memory is overcommitted; increase the size of its memory pool.
    • If the rate gets too low for a process, assume that its memory pool can be reduced in size.
    • If the sum of all memory pools don't fit in memory, deactivate some processes.
  • In practice, today's operating systems don't worry much about thrashing:
    • With personal computers, users can notice thrashing and handle it themselves:
      • Typically, just buy more memory
      • Or, manage balance set by hand
    • Thrashing was a bigger issue for timesharing machines with dozens or hundreds of users:
      • Why should I stop my processes just so you can make progress?
      • System had to handle thrashing automatically.
    • Technology changes make it unreasonable to operate machines in a range where memory is even slightly overcommitted; better to just buy more memory.