Introduction to Data Analysis                                                                   Rev: 3/29/2019

Sociology 180B/280B

 

Draft Syllabus

 

Class: Tuesday and Thursday, 3P-4:20P

 

Lab/Section meets once a week, at one of these times:

1) Fridays 10:30A-11:50A

or

2) Thursdays 6P-7:20P

 

Michael J. Rosenfeld

Professor

Department of Sociology

Building 120 room 124

mrosenfe@stanford.edu

The class website is my personal Stanford website

www.stanford.edu/~mrosenfe

Office Hours TBA

 

TAs

Amy Johnson (aljohnson@stanford.edu)

Michael Hahn (mikehahn@stanford.edu)

 

Use Canvas to submit homework

 

 

 

Introduction:

            This class will cover basic statistics including regression, how to do statistical analysis, and how to find flaws and problems in statistical analyses.

            In the process of learning about data analysis you will also learn about demography and stratification in the U.S., because the dataset we will use is the Current Population Survey of March 2000, which is a nationally representative survey of more than 60,000 households, with lots of information about race, gender, income, occupation, place of residence, and so on. You'll also learn how to use one of the most powerful and flexible tools for data analysis, the statistical software STATA.

 


Readings and Grading Policy

 

Books (available at Stanford Bookstore):

* Freedman, David, Robert Pisani, and Roger Purves. 2007. Statistics. Fourth Edition. W.W. Norton. $105, ISBN: 0393929728 (recommended). If you know a little about statistics already, or if you have taken one statistics class like Stats 60, you don’t need to buy the Freedman, and you can ignore the Freedman reading assignments.

* Tufte, Edward. 2001. The Visual Display of Quantitative Information. Graphics Press. $28, ISBN: 0961392142 (required).

 

Other readings will be linked from the class website.

 

 

Software Required (order online)

* Intercooled (IC) Stata, Version 15. You may purchase either a 1-year license for $125, or a perpetual license for $225. I recommend the perpetual license so that you can use this software in the future. The software comes with a short introductory Stata book. Don’t bother buying Stata’s massive printed reference book collection. I will teach you the Stata commands that you need to know, and the Stata online help is very good.

https://www.stata.com/order/new/edu/gradplans/campus-gradplan/

There are computer clusters at Stanford where you can run Stata for free, and you can run Stata over Unix but with reduced screen feedback. I strongly urge you to buy the Stata license and install it on your personal PC.

 

 

Computer Use Policy:

* Computer use by students in class is strictly limited to following along with the data analysis examples being presented by the professor.


GRADING:

 

1) Undergraduates, Soc 180B:

Homework: 4 homeworks, 15% each

Regular section participation: 10%

Final exam (based on data analysis part of the course): 30%

 

 

2) Graduate Students, Soc 280B:

Homework: 4 homeworks, 15% each

Regular section participation: 10%

In-class presentation (data analysis of dataset of your own choosing), outline: 10% (due date to be negotiated with Professor Rosenfeld)

In-class presentation (data analysis of dataset of your own choosing), actual presentation to class: 20% (class presentation date to be negotiated with Professor Rosenfeld)

 

 

Project and Reading Assignment Timeline

 

For each week: class dates with lecture goals, readings (required readings are marked "required"), and assignments, followed by that week's section.

Week 1

Apr 2: Introduction to the class.

Apr 4: Basics of descriptive data analysis using STATA. Readings: my Intro to Stata (required); Freedman Ch 4. Assignment: hand out HW#1.

Section: Work on HW 1 and on using STATA.

Week 2

Apr 9: Observational studies and their limitations. Readings: Freedman Ch 2.

Apr 11: Error and bias. Readings: Freedman Ch 6.

Section: Work on HW 1 and on using STATA.

HW 1 due Friday, April 12, at midnight.

Week 3

Apr 16: Error and bias. Readings: Freedman Ch 6. Assignment: hand out HW#2.

Apr 18: Probability sampling, sample size and power, and standard errors. Readings: Freedman Ch 20.

Section: Stata, and HW 2.

Week 4

Apr 23: More on sample size and power. Readings: Freedman Ch 21.

Apr 25: Statistics and hypothesis testing.

Section: Stata, and HW 2.

HW#2 due Friday, Apr 26, by midnight.

Week 5

Apr 30: Introduction to regression with STATA. Readings: Freedman Chs 9, 10. Assignment: hand out HW#3.

May 2: More on regression with STATA, interpreting coefficients. Readings: Freedman Chs 11, 12.

Section: Work on HW #3.

Week 6

May 7: Problems with and difficulties in using regression; graphing. Readings: Tufte, read the whole book (required).

May 9: Proper and improper presentation of data.

Section: Work on HW #3.

HW#3 due Friday, May 10, by midnight.

Week 7

May 14: Additivity, linearity, and regression fits. Assignment: hand out HW #4.

May 16: Regression analysis: residuals and outliers. Readings: Jasso, Kahn and Udry, and Jasso’s response, all posted on my website (all required).

Section: Work on STATA, discuss the issues in HW 4.

Week 8

May 21: Logistic regression.

May 23: Logistic regression and the likelihood ratio test.

Section: Work on STATA, discuss the issues in CPS HW #4.

Week 9

May 28: Polls, polling aggregation, and election prediction.

May 30: Soc 280B in-class presentations.

HW #4 due Friday, May 31, by midnight.

Section: Work on STATA, discuss the issues in HW 4.

Week 10

June 4: Final Exam Review.

June 6: No class.

No section meetings.

Final exam: Saturday, June 8, 3:30P-6:30P