Below are the midterm questions from the cs249B midterm exam from 2008. 1. Software Engineering: We in software like to believe "software engineering" is really doing engineering. Describe three key problems with the software industry qualifying as doing real engineering. 2. Audit: a. Give an example in audit (pseudo)code of an invariant that is too expensive to check as part of an inline assert. b. Describe three problems with invoking audit routines in general as part of the normal execution of a production application. c. One of your new junior programmers accepts the use of audit for state checking but still wants to use asserts as specification of preconditions. Argue against this use and justify it. 3. Shared Descriptions: a. Sharing large descriptions by providing them as a hidden pointer in a value type is straight-forward and efficient. What's the primary complication? b. PEEVE (pointer equality equals value equality) is challenging to achieve. Describe how to do this, and the costs fo doing so. 4. Concurrency: Consider a data mining application that is processing a "click stream" of click records of users visiting a web site. These records include an indication of a "cookie" to identify a given session/user. The data mining application needs to collect statistics about these session and categories of users/sessions as well as log incremental summaries. Sketch the design of a parallel implementation of this data mining application reflecting the principles of the course. 5. Collections and Iterators: Iterator semantics could, as one choice, be specified as operating on a snapshot of a collection, logically created as a snapshot of the collection at the time the iterator is created. a. Describe the problems with iterator semantics and updates to collections, illustrating with an example. b. Assuming there are concurrent adds and deletes taking place with a collection during the lifetime of an iterator, what are the issues an application may have to deal with using the snapshot model and how does that compare to the "version"-based approach in the notes. c. What are the performance/efficiency issues with these snapshot semantics compared to the version-based approach.