Introduction to Data Analysis for Sociology Graduate Students

rev: 9/14/2012

Syllabus 

Fall Quarter, 2012

Tuesdays and Thursdays

2:15-3:30

Education building, room 130

 

Lab/Section once a week, time and place TBA

 

Michael J. Rosenfeld

Associate Professor

Department of Sociology

Building 120 room 124

mrosenfe@stanford.edu

The class website is my personal Stanford website

www.stanford.edu/~mrosenfe

Office Hours by appointment

 

TAs:

Anna Lunn (alunn@stanford.edu)

Kate Weisshaar (weisshaar@stanford.edu)

 

 

Introduction:

            In this class you will teach yourself basic statistics including regression, how to do statistical analysis, and how to find flaws and problems with statistical analyses.

            In the process of learning about data analysis you will also learn about demography and stratification in the U.S., because the dataset is the Current Population Survey of March, 2000, which is a nationally representative survey of more than 60,000 households, with lots of information about race, gender, income, occupation, place of residence, and so on.  You'll also learn how to use one of the most powerful and flexible tools for data analysis, the statistical software STATA. 

 


Readings and Grading Policy

 

Books required (available at Stanford Bookstore):

* Tufte, Edward. 2001. The Visual Display of Quantitative Information. Graphics Press. ISBN-10: 0961392142. $30

* Statistics with Stata ver 12, by Lawrence C. Hamilton. 2013, Brooks/Cole. ISBN-10: 0840064632. $75

 

Recommended Books:

* Mathematical Statistics and Data Analysis, by John Rice, Duxbury Press, 3rd edition 2006, ISBN-10: 0534399428. $175

* Freedman, David, Robert Pisani, and Roger Purves. 2007. Statistics. Fourth Edition. W.W. Norton. $125. ISBN-10: 0393929728

 

 

The most important readings for the class are the Excel files, Stata logs, and PDF documentation posted on my website. Aside from the Tufte book, which we will be talking about specifically in class, the other books are all supplementary. That is, you don’t need the books. Briefly, here is why you should own the books anyway:

* Hamilton is a good practical book about how to get things done in Stata. Even experienced Stata hands will learn something from this book.

* Freedman is a classic introductory text about statistics, with no math, but with very good plain English explanations. If you don’t have a math background, Freedman’s explanations may be helpful to you. If you do have a math background, Freedman may help you explain statistics to other people. And if you end up teaching undergraduate statistics in the future, you may be teaching from Freedman.

* Rice is a classic introduction to statistics for readers who have at least a modest familiarity with calculus. Rice offers outlines of proofs and lots of great problems you can work through on your own. Rice is a great reference book that you should have on your shelf if you plan on doing any data analysis.

 

 

Software Required

* You will need Stata in order to do the homework for Soc 381. You have several options:

1) The least easy and least palatable option is to use Stata over Unix. This is free but very cumbersome.

2) Stata is installed in the graduate student computer cluster, running on Windows PCs. This is a good solution, except that you won’t have access to Stata in class or when you are home.

3) The option that offers the most convenience, but also costs the most, is for you to buy a license for Intercooled (IC) Stata, Version 12. You may purchase either a 1 year license for $98, or a perpetual license for $179. I recommend the perpetual license so that you can use this software in the future. The software comes with a small introduction-to-Stata book. Don’t bother buying Stata’s massive printed reference book collection for this class. I will teach you the Stata commands that you need to know, and the Stata online help is very good.

http://www.stata.com/order/new/edu/gradplans/gp-direct.html

Note that the Graduate Student Computer Lab currently runs Stata version 10, Professor Rosenfeld will be using Stata version 11, and students who buy Stata will be getting Stata version 12. These versions work pretty much the same way.

           

 

 

Computer Use Policy:

* Computer use by students during class is strictly limited to following along with the data analysis examples being presented by the professor.

 

 

 

Grading:

Project 2 (Data analysis and interpretation)

Homework: 4 homeworks, 10% each

Regular section participation: 5%

In-class presentation (data analysis of dataset of your own choosing), outline: 5%

In-class presentation (data analysis of dataset of your own choosing), actual presentation to class: 20%

Final Exam: 30%

 

 

 

Project and Reading Assignment Timeline

 

Each entry below gives the class date and the class lecture goals, followed by the reading and the assignment where there is one. (Readings in bold are required and will be discussed specifically in that class. Other readings are supplementary.)

Week 1

Sept 25: Introduction to Stata and Data Analysis Section
Assignment: Hand out CPS HW #1

Sept 27: Basics of descriptive data analysis using STATA
Reading: Read Hamilton, chapters 1 and 2 (Introduction, and Data Management) and Hamilton’s Chapter 5 (Summary Statistics and Tables). Read Rosenfeld’s online Stata guide

Section: Work on HW 1 and on using STATA

Week 2

Oct 2: Observational Studies and their limitations
Reading: Freedman Ch 2, 4

Oct 4: Error and bias
Reading: Freedman Ch 6
Assignment: HW #1 due. Hand out HW #2

Section: Stata, and HW 2

Week 3

Oct 9: Probability sampling, Sample size and power, and standard errors
Reading: Freedman Ch 20; read also Hamilton, Ch 7, Linear Regression Analysis; Rice, ch. 6

Oct 11: More on sample size and power.
Reading: Freedman Ch 21; Rice, p. 398-411

Section: Work on STATA, discuss the issues in HWs 2 and 3

Week 4

Oct 16: Introduction to regression with STATA
Reading: Freedman Chs 9, 10
Assignment: HW #2 due. Hand out HW #3

Oct 18: More on regression with STATA, interpreting coefficients
Reading: Freedman, Ch 11; Rice ch. 14

Section: Work on STATA, discuss the issues in CPS HW #3

Week 5

Oct 23: Problems with and difficulties in using regression, Graphing.
Reading: Freedman Ch 12

Oct 25: More limitations of regression analysis
Reading: Tufte, P. 1-87
Assignment: HW #3 due. Hand out HW #4

Section: Work on STATA

Week 6

Oct 30: Proper and improper presentation of data
Reading: Tufte, P. 90-190, Hamilton on regression diagnostics, 192-209

Nov 1: Regression analysis: residuals and outliers
Reading: Readings by Jasso and Kahn and Udry, posted on class website

Section: Work on HW 4

Week 7

Nov 6: Other topics, including logistic regression
Reading: Hamilton Ch 9, logistic regression; Rice p. 253-268

Nov 8: More on logistic regression
Assignment: HW #4 due

Week 8

Nov 13: Some additional, and advanced topics

Nov 15: Some additional, and advanced topics
Assignment: Presentation Proposals Due

Week of Nov 19-23: Thanksgiving break

Week 9

Nov 27: Some additional, and advanced topics/ Student Presentations

Nov 29: Student Presentations

Week 10

Dec 4: Student Presentations, and some Final Exam Review

Dec 6: Student Presentations, and some Final Exam Review


Final Exam

 

Final Exam at the regularly scheduled time and place