| Winter 2008 Midterm Solutions |
| Problems | Grader |
|---|---|
| 1,2 | Derrick |
| 3 | Marius |
| 4, 7 | Varun |
| 5,6 | Mendel |
| 8 | David |
| 9, 10 | Megan |
A local replacement policy means that a page fault in one process only replaces one of its own frames, rather than any frame in the system.
The "clock" algorithm is essentially a circular list of pages and a "hand", or a pointer to the next one to consider for eviction. The clock approximates the LRU algorithm because when a page fault occurs and the kernel needs to free a frame to give to the process, it first looks at the page pointed to by the hand. It checks the reference bit, and if it is cleared, it means the page has not been accessed for a while so it selects it for eviction and allocates that frame for the new page. If the reference bit is set then we know the page has been accessed recently. Therefore we advance the hand to point to the next page in memory and consider that one for eviction. We continue this until we find a page that has its reference bit cleared and thus we have our needed frame.
A local replacement policy with one clock per process does make sense. Here the clock algorithm is implemented with one clock per process in order to efficiently achieve a local replacement policy. Each process has its own list of pages and its own clock hand that points to the next page to consider for eviction. Thus when a page fault occurs it will iterate over the list of its own pages, find one with its reference bit cleared, and select it for eviction.
Function1 appears to take longer than Function2 because Function 1:
The catch is that time_sleep(1) really means "sleep until next tick". Therefore, assuming minimal overhead from the loop, what Function1 actually does is:
a. Programs that make use of shared libraries do not need to contain the routines of the library within their executable. Instead, the library is present on the disk once and shared by all programs that make use of it.
b. The answer that we were looking for was that using libraries segments your code into different pages and can cause more pages to be used than necessary, which increases the number of PTE's and increases the memory used by page tables.
A common wrong answer was that static shared libraries always require the entire library to be mapped into the page table. This is not true because the address space is the same for every program and thus, no explicit mapping is necessary. Once the program seg faults, the linker will bring the library code into the set address.
a. Overhead will decrease. Reason: By increasing the Page Size, the number of pages needed to refer to the process decrease. So, there are lesser number of entries in the TLB for the same process. This makes the Miss Rate go down thereby causing the overhead to decrease.
Note: Some students mentioned that the miss penalty will increase as the time required to service a miss increases(as now we bring a larger page into the memory). This will cause a increase in the overhead. This was also an acceptable answer.
b. Overhead will decrease. Reason: By moving from a hierarchical page table to a flat page table, in case of a miss the number of lookups needed to service the miss reduce, thereby reducing the miss penalty.
c. No Change. Reason: This does not change the miss rate or miss penalty.
Although the approach of dividing the CPU usage of each process by 2 every second worked well for small numbers of processes, it had problems with large number of processes (i.e. high load averages) where each process only gets the CPU infrequently. Since each process only accumulated CPU time slowly, the rapid decay of dividing by 2 every second resulted in almost all the processes recording zero usage. This defeated the attempt of the scheduler to distinguish CPU-bound and I/O-bound jobs using CPU usage. The result is every process has the same priority and the scheduler behaved like a round-robin scheduler.
There are two major benefits of having a queue per processor:
(1) Processor affinity based scheduling, where the CPU scheduler tries to keep a process on the same CPU to reuse state in the caches and TLB, is easily supported in the queue per processor model.
(2) Having only a single processor frequently querying a queue resulted in lower synchronization overhead than having a single queue being accessed by multiple processors. This particularly true as the number of processors is increased.
If a process exhausts its time slice without blocking or exiting, the MLFQS scheduler will place it in a lower priority queue with double the current time slice.
Note: A lot of students mentioned that by increasing the priority (either by low recent cpu or low nice value) the process can be given longer time slice.H owever this is not correct. The process will be run more often but will not have a longer timeslice than it actually had. Points were deducted for such a reasoning.
If you mentioned that lowering the priority can make a process get a longer timeslice, but did not mention how a process can go about doing that, full credit was not given.
a. An interrupt handler is not allowed to block. If the semaphore is at 0 and the interrupt handler attempted to down it, the interrupt handler would need to block. If it blocks it could cause a deadlock (Pintos detects if an interrupt handler attempts to block and instead causes a kernel panic).
b. Yes, monitors would have the same problem because they use semaphores or locks to enforce their synchronization. So they could still force the interrupt handler to block and cause a deadlock.
c. No, because wait free synchronization does not involve blocking. It makes of copy of the data structure it wants to modify and modifies the copy. It then tries to atomically replace the real data structure with the modified copy using an atomic compare-and-swap instruction. If the real data structure had changed since it made the copy, then it throws away the modified copy and repeats the process until it successfully updates the real data structure.
Hoare semantics required that a thread signaling a condition explicitly pass the monitor lock to the woken-up thread and run it immediately. This guaranteed that the condition would still be valid when the woken-up thread ran. Therefore, it was not necessary to re-check the condition after calling wait().
This proved difficult to implement, so modern monitor implementations use Mesa semantics. The woken-up thread or threads are simply placed back on the ready list, and the monitor lock is retained by the signaling thread. There is no guarantee that the lock will be obtained by the woken-up thread, as it could be obtained by some other thread in the meantime, who changes the condition. Therefore, the condition needs to be re-checked by the woken-up thread after returning from wait().
Note that it is not true that the loop is to force the thread to re-acquire the monitor lock. wait() automatically releases, sleeps, and re-acquires the monitor lock in both implementations.
The problem is not limited to cond_broadcast(), or to multi-processor/thread systems.
When the scheduler is invoked, it can examine the running thread's Program Counter and see if it is between any of the first/last instructions of any critical section. If so, it should continue running the same thread.
Requiring the system to set a do-not-preempt flag every time it entered or left a critical section would require significant hardware support for this system, since the hardware would have to compare the current PC with a whole bunch of address ranges in a single cycle.
Note that it is not enough to make sure two threads are not running in the same critical section. Any thread running in a critical section is expecting that code to run atomically, meaning no preemption. Two critical sections may touch the same shared data and therefore the scheduler should not interleave them.