Stanford University
Computer Science 249B: Winter 2007 - 2008

CS249B Midterm 2008


Feb 7th 2008. Closed book exam.

You should complete the exam yourself without assistance from others. It is a violation of the honor code to receive assistance from others or to copy material without proper attribution.

Feel free to use point form. Each question can be answered satisfactorily in a page of writing or less. I am primarily interested in the most key points responsive to each question. Each question is weighted equally.


  1. Software Engineering: We in software like to believe "software engineeing" is really doing engineering. Describe three key problems with the software industry qualifying as doing real engineering.
  2. Audit:
    1. Give an example in audit (pseudo) code of an invariant that is too expensive to check as part of an inline assert.
    2. Describe three problems with invoking audit routines in general as part of the normal execution of a production application.
    3. One of your new junior programmers accepts the use of audit for state checking but still wants to use asserts as specification of preconditions. Argue agains this use and justify it.
  3. Shared Descriptions:
    1. Sharing large descriptions by providing them as a hidden pointer in a value type is straight-forward and efficient. What's the primary complication?
    2. PEEVE (pointer equality equals value equality) is challenging to achieve. Describe how to do this, and the costs of doing so.
  4. Concurrency: Consider a data mining application that is processing a "click stream" of click records of users visiting a web site. These records include an indication of a "cookie" to identify a given session/user. The data mining application needs to collect statistics about these sessions and categories of users/sessions as well as log incremental summaries. Sketch the design of a parallel implementation of this data mining application reflecting the principles of the course.
  5. Collections and Iterators: Iterator semantics could, as one choice, be specified as operating on a snapshot of the collection, logically created as a snapshot of the collection at the time the iterator is created.
    1. Describe the problems with iterator semantics and updates to collections, illustrating with an example.
    2. Assuming there are concurrent adds and deletes taking place with a collection during the lifetime of an iterator, what are the issues an application may have to deal with using the snapshot model and how does that compare to the "version"-based approach in the notes.
    3. What are the performance/efficiency issues with these snapshot semantics compared to the version-based approach.

    The End