Stats 300A: Theory of Statistics

Lester Mackey, Stanford University, Fall 2015

Lectures

Tuesday and Thursday, 1:30 - 2:50 PM in Mitchell Earth Sciences Building, Room B67.

Course Staff Email Address

stats300a-aut1516-staff@lists.stanford.edu

Instructor

Lester Mackey. Office hours: Thurs. 12:20 - 1:20 PM, 3 - 4 PM, 141 Sequoia Hall.

Teaching Assistants

Kelvin Gu. Office hours: Mon. 1:30 - 3:30 PM, 216 Sequoia Hall.

Feng Ruan. Office hours: Weds. 4:30 - 6:30 PM, 207 Sequoia Hall.

Qingyuan Zhao. Office hours: Tues. 12:20 - 1:20 PM, 3 - 4 PM, 206 Sequoia Hall.

Prerequisites

Real analysis, introductory probability (at the level of Statistics 116), and introductory statistics.

Texts

Grading

Your grade will be determined by scribing (2%), weekly problem sets (38%), a midterm (20%), and a final exam (40%).
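As an illustration of how these weights combine, here is a minimal sketch. The function name `course_grade`, the dictionary keys, and the 0-100 scale for each component are assumptions made for the example. The syllabus specifies only the percentages and does not say how the weighted score maps to a letter grade.

```python
# Weights come from the grading policy above; everything else is hypothetical.
WEIGHTS = {
    "scribing": 0.02,
    "problem_sets": 0.38,
    "midterm": 0.20,
    "final": 0.40,
}


def course_grade(scores):
    """Return the weighted course score, assuming each component is on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {"scribing": 100, "problem_sets": 90, "midterm": 80, "final": 85}
    print(round(course_grade(example), 2))
```

Because the final exam and the problem sets together carry 78% of the weight, they dominate the result in this sketch.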

Scribing

In order to gain experience with technical writing, each student will be required to prepare scribe notes for a single lecture. After taking careful notes in class, the scribes for a given lecture will jointly prepare a LaTeX document (using this style file and this template) written in full prose understandable to a student who may have missed class. The LaTeX document, along with any image or auxiliary files, should be submitted to the staff list within two days (excluding weekends) of the scribed lecture. After review, the scribe notes will be posted to the course website.

Please sign up to scribe a specific lecture using this spreadsheet.

You will find the LaTeX scribe notes, style file, and any supporting image files from last year's edition of the course on the Stats 300A Coursework site (in the Materials section). You are encouraged to build on and improve these notes rather than start from scratch. Take special note of components of the notes that are inadequately explained or motivated and of material that has changed from last year.

Problem Sets

Problem sets posted on the class website will be due in class on Thursdays at the start of lecture. If you are traveling, you may email your solution to one of the course staff in advance of the deadline. Ten percent of the homework value will be deducted for each day a homework is late. Exceptions will be made for documented emergencies. No credit will be given for homework submitted after solutions have been posted.
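The late policy can be read as a simple calculation, sketched below under several assumptions that the policy itself does not state: lateness is counted in whole days, the ten percent is taken from the maximum point value of the assignment (not from the earned score), and the result never drops below zero. The function name and parameters are hypothetical, and documented emergencies are not modeled.

```python
def late_adjusted_score(raw_score, max_points, days_late, solutions_posted=False):
    """Apply one possible reading of the late policy to a homework score.

    Assumptions (not stated in the syllabus): days_late is a whole number,
    the deduction is 10% of max_points per day, and scores are floored at 0.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if solutions_posted:
        # No credit once solutions have been posted.
        return 0.0
    deduction = max_points * days_late / 10  # 10% of the homework value per day
    return max(0.0, raw_score - deduction)


if __name__ == "__main__":
    # A 50-point homework scored at 45 and submitted two days late.
    print(late_adjusted_score(45, 50, 2))
```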

After attempting the problems on an individual basis, each student may discuss a homework assignment with up to two classmates. However, each student must write up their own solutions individually and explicitly name any collaborators at the top of the homework.

Please keep in mind the university honor code.

Midterm

The midterm will be held in our normal classroom during our normal class time on Thurs. Oct. 29. You will have 80 minutes to complete the midterm, which will cover material up to but not including minimaxity. During the midterm, you may refer to a single-sided 8.5 x 11 in. sheet of notes. You may not refer to any other notes, your textbook, your laptop, the internet, or any other outside resources. Unless a problem explicitly states otherwise, you will be free to cite results proved in class, in the textbook, or on your problem sets.

Final

The final will be distributed online through the course website at 10 AM on Mon. Dec. 7 and must be returned by 10 AM on Weds. Dec. 9. You may refer to your notes and your textbooks during the exam. You should not need to access the internet to answer any exam questions, and using the internet to search for solutions to these problems is cheating and a violation of our honor code. However, you may use Wikipedia to look up the form of unfamiliar distributions. Unless a problem explicitly states otherwise, you will be free to cite results proved in class, in the textbook, or on your problem sets.

Course Overview

How do you optimally estimate the preferences of a population?

Is there an optimal test for drug efficacy?

How do you construct an optimal confidence interval?

In Stats 300A, we will examine such questions of optimal statistical inference through the lens of finite sample theory. By the quarter's end, students will have learned to

  1. Formalize new estimation and hypothesis testing tasks as decision theoretic problems,

  2. Compress data optimally using sufficient and complete statistics,

  3. Design optimal decision procedures under the constraints of unbiasedness and equivariance,

  4. Construct optimal estimators under worst-case (minimax) and average-case (Bayesian) criteria, and

  5. Develop optimal hypothesis tests and confidence intervals, when they exist.

Course Topics

Decision Theory

Loss, Risk, Admissibility

Principles of Data Reduction

Sufficiency, Ancillarity, Minimal Sufficiency, Completeness

Statistical Models

Exponential Families, Group Families, Nonparametric Families

Finite Sample Theory of Point Estimation

Minimum Risk Unbiased Estimation, Minimum Risk Equivariant Estimation, Bayes Estimation, Minimax Estimation

Finite Sample Theory of Hypothesis Testing and Confidence Intervals

Neyman-Pearson Theory, Uniformly Most Powerful (UMP) Tests and Uniformly Most Accurate Confidence Intervals, UMP Unbiased Tests, UMP Invariant Tests