Demand Paging

Optional readings for this topic from Operating Systems: Principles and Practice: Chapter 9.

Overall goal: allow programs to run without all of their information in memory

Keep in memory the information that is being used.
Keep unused information on disk in a paging file (also called the backing store, or swap space).
Move information back and forth as needed.
Locality: most programs spend most of their time using a small fraction of their code and data.
Ideally: paging produces a memory system with the performance of main memory and the cost/capacity of disk!
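
A rough way to see how close paging gets to this ideal is to compute the average cost of a memory reference for a given page fault rate. The sketch below does that arithmetic. The 100 ns memory access and the 10 ms fault service time are illustrative assumptions, not figures from these notes, and effective_access_ns is a hypothetical helper name.

```python
MEMORY_NS = 100            # assumed main-memory access time
FAULT_NS = 10_000_000      # assumed time to service a page fault from disk (10 ms)


def effective_access_ns(fault_rate):
    """Average cost of one reference when a fraction fault_rate of them fault."""
    return (1 - fault_rate) * MEMORY_NS + fault_rate * FAULT_NS


for rate in (0.0, 1e-6, 1e-3):
    print(f"fault rate {rate:g}: {effective_access_ns(rate):.1f} ns per reference")
```

Under these assumed numbers, even one fault per thousand references makes the average reference roughly 100 times slower than main memory, so paging only approaches the ideal when locality keeps the fault rate very low.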

When a program is running, each page can be either:

In memory (in a physical page frame).
On disk (in the backing store).

Page Faults

What happens when a process references a page that is in the backing store?

For pages in the backing store, the present bit is cleared in the page map entries.
If present is not set, then a reference to the page causes a trap to the operating system.
These traps are called page faults.
To handle a page fault, the operating system:
- Finds a free page frame in memory
- Reads the page in from backing store to the page frame
- Updates the page map entry, setting present
- Resumes execution of the thread
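
The four steps above can be sketched as a toy simulation. This is a minimal model, not how a real kernel is structured: TinyPager, PageTableEntry, and the dictionary used as a backing store are all hypothetical names, and the sketch assumes a free frame is always available (replacement is not modeled here).

```python
class PageTableEntry:
    def __init__(self):
        self.present = False   # cleared while the page lives in the backing store
        self.frame = None      # physical frame number, meaningful only if present


class TinyPager:
    """Toy model: page contents are strings, the backing store is a dict."""

    def __init__(self, num_frames, backing_store):
        self.backing_store = backing_store            # page number -> contents
        self.frames = [None] * num_frames             # simulated physical memory
        self.free_frames = list(range(num_frames))    # frames not yet in use
        self.page_table = {p: PageTableEntry() for p in backing_store}
        self.faults = 0

    def read(self, page):
        entry = self.page_table[page]
        if not entry.present:          # real hardware would trap to the OS here
            self.handle_page_fault(page)
        return self.frames[entry.frame]

    def handle_page_fault(self, page):
        self.faults += 1
        frame = self.free_frames.pop()                 # 1. find a free page frame
        self.frames[frame] = self.backing_store[page]  # 2. read the page in
        entry = self.page_table[page]
        entry.frame = frame                            # 3. update the map entry,
        entry.present = True                           #    setting present
        # 4. returning lets read() complete the access, like resuming the thread


pager = TinyPager(num_frames=2, backing_store={0: "code", 1: "data"})
first = pager.read(0)    # faults: page 0 is not present yet
second = pager.read(0)   # no fault: the present bit is now set
print(first, second, pager.faults)
```

The second read finds the present bit set and never enters the handler, which is the common case paging depends on.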

In order for paging to work, hardware must provide two features:

Hardware must save the virtual address that caused the fault (CR2 register in x86).
If the fault occurs in the middle of a complex instruction, hardware must undo any changes made by the instruction, so it can be restarted from the beginning.

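Once the basic page fault mechanism is working, the OS has two policy decisions to make:

Page fetching: when to bring pages into memory.
Page replacement: which pages to throw out of memory.
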
Most modern OSes use demand fetching:

Start the process with no pages loaded; don't load a page into memory until it is referenced.
The pages for a process divide into three groups:
- Read-only code pages: read from the executable file when needed.
- Initialized data pages: on first access, read from the executable file. Once loaded, a page that is paged out must be saved to the paging file, since its contents may have changed.
- Uninitialized data pages (stack, malloc): on first access, just clear memory to all zeros. When paging out, save to the paging file.

Prefetching: try to predict when pages will be needed and load them ahead of time to avoid page faults.

Requires predicting the future, so hard to do.
One approach: when taking a page fault, read many pages instead of just one (wins if program accesses memory sequentially).
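
The read-ahead approach above can be illustrated with a small simulation. It is a sketch under simplifying assumptions: memory is unlimited (nothing is ever evicted), a "cluster" is the faulting page plus the pages that follow it, and faults_with_readahead is a hypothetical name.

```python
def faults_with_readahead(refs, cluster):
    """Return (page faults, pages read from disk) for a reference string."""
    resident = set()
    faults = 0
    pages_read = 0
    for page in refs:
        if page not in resident:
            faults += 1
            # Read the faulting page plus the next cluster - 1 pages.
            resident.update(range(page, page + cluster))
            pages_read += cluster
    return faults, pages_read


sequential = list(range(16))      # program walks through memory in order
scattered = [0, 40, 8, 120]       # program jumps around

print(faults_with_readahead(sequential, 1))   # one page per fault
print(faults_with_readahead(sequential, 4))   # four pages per fault
print(faults_with_readahead(scattered, 4))
```

On the sequential trace, reading four pages per fault cuts the fault count by a factor of four. On the scattered trace it saves nothing and reads twelve pages that are never used.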

Page replacement: once all of memory is in use, the OS will need to throw out one page each time there is a page fault. Which one?

Random: pick any page at random (works surprisingly well!)

FIFO: throw out the page that has been in memory longest.

MIN: throw out the page whose next reference is farthest in the future. This is optimal, but it requires us to predict the future.

Least Recently Used (LRU): throw out the page that has gone the longest without being referenced. Uses the past to predict the future.
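
To compare the deterministic policies above, the sketch below counts page faults for FIFO, LRU, and MIN on one reference string. The function names and the reference string are assumptions chosen for illustration; Random is left out because its result varies from run to run.

```python
from collections import OrderedDict, deque


def fifo_faults(refs, num_frames):
    resident, order, faults = set(), deque(), 0
    for page in refs:
        if page in resident:
            continue
        faults += 1
        if len(resident) == num_frames:
            resident.remove(order.popleft())   # evict the page loaded earliest
        resident.add(page)
        order.append(page)
    return faults


def lru_faults(refs, num_frames):
    resident, faults = OrderedDict(), 0        # ordered oldest use -> newest use
    for page in refs:
        if page in resident:
            resident.move_to_end(page)         # mark as most recently used
            continue
        faults += 1
        if len(resident) == num_frames:
            resident.popitem(last=False)       # evict the least recently used
        resident[page] = True
    return faults


def next_use(refs, start, page):
    """Index of the next reference to page at or after start, or infinity."""
    for j in range(start, len(refs)):
        if refs[j] == page:
            return j
    return float("inf")


def min_faults(refs, num_frames):
    resident, faults = set(), 0
    for i, page in enumerate(refs):
        if page in resident:
            continue
        faults += 1
        if len(resident) == num_frames:
            # Evict the page whose next reference is farthest in the future.
            victim = max(resident, key=lambda p: next_use(refs, i + 1, p))
            resident.remove(victim)
        resident.add(page)
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for name, policy in [("FIFO", fifo_faults), ("LRU", lru_faults), ("MIN", min_faults)]:
    print(name, policy(refs, 3))
```

With 3 frames this short trace gives 9 faults for FIFO, 10 for LRU, and 7 for MIN. The trace is small and deliberately awkward, so the FIFO and LRU numbers say little about real programs; the point is that MIN sets a lower bound that no policy without knowledge of the future can beat.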

Implementing LRU: need hardware support to keep track of which pages have been used recently.

Perfect LRU?
- Keep a hardware register for each page, store system clock into that register on each memory reference.
- To choose a page for replacement, scan through all pages to find the one with the oldest clock.
- Hardware costs prohibitive in the early days of paging; also, expensive to scan all pages during replacement.
- No machines have actually implemented this.
Current computers settle for an approximation that is efficient. Just find an old page, not necessarily the oldest.

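Two extra bits in each page map entry:

Referenced: set by hardware whenever the page is read or written.
Dirty: set by hardware whenever the page is modified.
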
Clock algorithm (also called second chance algorithm). To choose a page for replacement:

Cycle through pages in order (circularly).
If the current page has been referenced, then don't replace it; just clear the reference bit and continue to the next page.
If the page has not been referenced since the last time we checked it, then replace that page.
Optionally: if the dirty bit is set, don't replace this page now, but clear the dirty bit and start writing the page to disk.
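
The sweep described above can be sketched as follows. This is a simplified model: ClockReplacer is a hypothetical name, the reference bit is set in software where real hardware would set it, and the optional dirty-bit handling is not modeled.

```python
class ClockReplacer:
    def __init__(self, num_frames):
        self.pages = [None] * num_frames        # page held by each frame
        self.referenced = [False] * num_frames  # one reference bit per frame
        self.hand = 0                           # next frame the clock hand checks
        self.faults = 0

    def access(self, page):
        if page in self.pages:
            # Hit: hardware would set the reference bit on this access.
            self.referenced[self.pages.index(page)] = True
            return
        self.faults += 1
        frame = self.choose_victim()
        self.pages[frame] = page
        self.referenced[frame] = True

    def choose_victim(self):
        while True:
            frame = self.hand
            self.hand = (self.hand + 1) % len(self.pages)
            if self.pages[frame] is None or not self.referenced[frame]:
                return frame                    # empty, or not referenced since last check
            self.referenced[frame] = False      # referenced: clear the bit, move on


clock = ClockReplacer(3)
for page in [1, 2, 3, 4, 2, 5]:
    clock.access(page)
print(clock.pages, clock.faults)
```

In this run, page 2 is referenced again just before the last fault, so the hand clears its bit and skips it, and page 3 is replaced instead.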

How to implement page replacement when there are multiple processes running in the system?

Global replacement:
- All pages from all processes lumped into a single replacement pool.
- Each process competes with all the other processes for page frames.
Per-process replacement:
- Each process has a separate pool of pages.
- A page fault in one process can only replace one of its own frames.
- Eliminates interference from other processes.
- Dilemma: how many page frames to allocate to each process? Poor choice can result in inefficient memory usage.
Most systems use global replacement.

What happens if memory gets overcommitted?

Suppose the pages being actively used by the current threads don't all fit in physical memory.
Each page fault causes one of the active pages to be moved to disk, so another page fault will occur soon.
Just run another thread while the page is read in? That doesn't help: the other thread will soon fault too.
The system will spend all its time reading and writing pages, and won't get much work done.
This situation is called thrashing; it was a serious problem in early timesharing machines with dozens or hundreds of users:
- Users won't volunteer to fix it: "Why should I stop my processes just so you can make progress?"
- System had to handle thrashing automatically: stop running some processes for a while.
With personal computers, users can notice thrashing and kill some processes, so that others can make progress.
Memory is cheap enough that there's no point in operating a machine in a range where memory is even slightly overcommitted; better to just buy more memory.
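
How sharp the drop into thrashing can be is easy to reproduce in a simulation. The sketch below assumes LRU replacement and a program that cycles through 5 active pages; lru_fault_count and the trace are hypothetical, chosen as a worst case rather than a typical workload.

```python
def lru_fault_count(refs, num_frames):
    frames = []                      # least recently used page is at index 0
    faults = 0
    for page in refs:
        if page in frames:
            frames.remove(page)      # will be re-appended as most recent
        else:
            faults += 1
            if len(frames) == num_frames:
                frames.pop(0)        # evict the least recently used page
        frames.append(page)
    return faults


active_pages = 5
refs = list(range(active_pages)) * 20        # 100 references over 5 pages

fits = lru_fault_count(refs, 5)              # all active pages fit
overcommitted = lru_fault_count(refs, 4)     # one frame short
print(fits, overcommitted)
```

With 5 frames only the first 5 references fault. With 4 frames every one of the 100 references faults, because each fault evicts exactly the page that will be needed next.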