Preliminary instructions for the GSS paper project.
Use the General Social Survey Cumulative File to do a brief analysis. In the paper you will relate the analysis to readings we have done.
We will be using
Basic web link:
Select the General Social Survey (GSS) Cumulative Datafile 1972-2008, or follow this link:
Things to know about the GSS, and to keep in mind:
* The GSS was fielded in most years from 1972 to 1993 and every other year from 1994 to 2008.
* There are hundreds of variables available to study, but not all variables are available for all years. Be sure to tabulate your variable against year to know which years are available. That is, make a table where “year” is the row variable and your variable of interest is the column variable. Run the table to see what years are available for your variable.
* Always check the box for “Question Text” so that the full text of the question appears along with your table and chart.
* If you are dealing with a variable that has lots and lots of levels, you may need to recode the variable to a more manageable set of categories. See http://sda.berkeley.edu/HELPDOCS/helpan.htm#recode
For instance, the feeling thermometers (how do you feel about Catholics...) give respondents a choice of any value from 0 to 100, 100 corresponding to the most positive feeling. If you tabulate a variable like “CATHTEMP” you get a mess. You need to recode it into something like 3 categories (roughly interpreted as don’t like Catholics, don’t mind Catholics, and like Catholics), so CATHTEMP(r: 0-40; 41-60; 61-100).
The same recoding syntax works when you are using variables as Controls. If you
want to use
* Missing data codes. Variables like
* Also note: GSS has a variable for year the respondent was born, called COHORT. For some analyses, COHORT is more sensible as a control than AGE. Note that COHORT, YEAR, and AGE are all closely related. You have to think carefully about age (life-course) versus period versus cohort effects when you are trying to explain social change. And also realize that with data like the GSS, you often cannot disentangle age, period, and cohort effects. But do try to keep clear in your own language about which one you are talking about, and why.
* SDA doesn’t seem to care whether you capitalize the variable names or not.
* Percentages are crucial because the sample size of the GSS varies from year to year, and not every question is asked of every respondent. So if you want to know whether Americans’ attitudes about Catholics have changed over time, you want to know what percentage of respondents in each year “didn’t like” Catholics, for instance. If year is your row variable, you should probably be percentaging by row. Look at where the percentages add up to 100%, and ask yourself if that is what you want.
* The weights in GSS are designed to correct for things like household size and for the oversampling of blacks that took place in a couple of rounds of the GSS. The weights don’t make a huge difference, so my advice is to ignore the weights (that is, select “No Weight” in the weight option).
* You can copy the tables and figures that SDA produces directly into your paper (in Windows, it is right-click, copy, then paste). When pasting, please paste as a picture or as a bitmap, because otherwise your graphic may not be viewable when I try to open the file. If in doubt, you can always convert your file to an Acrobat pdf file, which tends to embed all graphics in a readable way. Also, you certainly may retype the output into Excel and that will give you more control over how the figures look.
* You can only control for one variable at a time in SDA, but you can use several variables in combination as filters.
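To make the recoding and row-percentaging advice above concrete, here is a minimal sketch in Python of what SDA is doing for you when you recode a thermometer variable and percentage by row. It is only an illustration, not how SDA is implemented: the function names are made up, the records are hypothetical toy values rather than real GSS responses, and treating None as the missing-data marker is an assumption.

```python
from collections import Counter, defaultdict

# The three bands used in the CATHTEMP example: 0-40, 41-60, 61-100.
CATEGORIES = ("0-40", "41-60", "61-100")


def recode_thermometer(value):
    """Collapse a 0-100 score into three bands; return None if missing or out of range."""
    if value is None or not 0 <= value <= 100:
        return None
    if value <= 40:
        return "0-40"
    if value <= 60:
        return "41-60"
    return "61-100"


def row_percentages(records):
    """records: iterable of (year, score) pairs.

    Returns {year: {category: percent}}, where each year's row sums to 100.
    Missing scores are dropped before percentaging.
    """
    counts = defaultdict(Counter)
    for year, score in records:
        category = recode_thermometer(score)
        if category is not None:
            counts[year][category] += 1
    table = {}
    for year in sorted(counts):
        total = sum(counts[year].values())
        table[year] = {c: 100.0 * counts[year][c] / total for c in CATEGORIES}
    return table


# Hypothetical toy data, NOT real GSS responses.
toy = [
    (1986, 30), (1986, 50), (1986, 70), (1986, 90),
    (1988, 10), (1988, 20), (1988, 55), (1988, None), (1988, 100),
]

table = row_percentages(toy)
for year, row in table.items():
    print(year, {c: round(p, 1) for c, p in row.items()})
```

Because each year's row is divided by that year's own total, years with different numbers of respondents can be compared directly, which is the point of percentaging by row.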
Something to keep in mind about the GSS data: The GSS data are generally cross-sectional data, meaning different people are interviewed in each wave. Cross-sectional datasets like the GSS are good at measuring the *prevalence* of phenomena, such as divorce, but not very good at measuring the *incidence* of divorce, or the divorce rate. That is, the GSS can tell us what percentage of US adults had a marital status of divorced at the time of the GSS survey in 1980, but we generally don’t know when those people got divorced, so we cannot infer anything about the annual divorce rate, or even the lifetime divorce rate (because people move back and forth between the divorced and married statuses).
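One way to see why prevalence does not pin down incidence is a deliberately simplified sketch in Python. It assumes a toy world with only two statuses (married and divorced) and constant yearly rates of divorce and remarriage. All of the rates and function names below are hypothetical, chosen for illustration, and are not estimates from the GSS or any other source.

```python
def steady_state_share_divorced(divorce_rate, remarriage_rate):
    """Long-run share divorced in the two-status toy model."""
    return divorce_rate / (divorce_rate + remarriage_rate)


def simulate_share_divorced(divorce_rate, remarriage_rate, years=500):
    """Start with nobody divorced and apply the yearly rates repeatedly."""
    share = 0.0
    for _ in range(years):
        # Some divorced people remarry; some married people divorce.
        share = share * (1 - remarriage_rate) + (1 - share) * divorce_rate
    return share


# Two hypothetical worlds with very different yearly divorce rates.
low_turnover = simulate_share_divorced(divorce_rate=0.02, remarriage_rate=0.08)
high_turnover = simulate_share_divorced(divorce_rate=0.05, remarriage_rate=0.20)

print(round(low_turnover, 3), round(high_turnover, 3))
```

In this toy model both worlds settle at about 20 percent divorced, even though the yearly divorce rate in the second is two and a half times the rate in the first. A cross-sectional snapshot would look the same in both, which is why a prevalence figure alone cannot tell you the divorce rate.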
A note about race and ethnicity: The GSS variable for race
A note about age: All respondents in the GSS are at least 18 years old; there are no minors among the respondents. There are questions about other children living in the respondent’s household, and there are questions about the respondents’ experiences when they were children.
1) What is expected of the GSS proposal:
Proposals should be one-page Adobe Acrobat documents, emailed to your TA on the due date. For the GSS proposal you should identify one variable you will use, along with year. In your proposal you should provide an embedded table or figure showing how your variable changes over time. If your variable of interest was only asked once, cross-tabulate it with another variable like race, education, geographic region, age (you might want to recode age into a few categories) or gender. Axes, variables, and categories should be appropriately labeled. Explain briefly what you want to know about the variable in question. Mention what other variable you are thinking of using as a control. When writing about GSS variables, please include the GSS name (e.g. SOCBAR or DIVORCE) as well as the variable description and full text of the question.
Students sometimes ask: “What GSS variables can be studied?” The answer is that you can focus on any variable as long as you believe you will be able to tie the results somehow back to the reading we have done.
2) What is expected of the GSS paper:
Papers should be 3-6 pages of text, plus at least two tables or figures. Convert your paper to Adobe Acrobat to make sure that the graphics are embedded in a way that we can read them. The main thing you want to write about is your chosen GSS variable, and how it varies. This is a short assignment, so you will need to get right to the point. First of all, your paper should respond to and accommodate your TA’s feedback on your proposal. In addition to the table or figure you included in your proposal, you need to include a second analysis, which brings a third variable in as a control. Appropriate controls include things like race, education, geographic region, religion, political party affiliation, age (you might want to recode age into a few categories) or gender, or something else you want to consider. The purpose of the control variable is for you to be able to write something about how the variable you’re interested in varies with respect to the control.
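If it helps to picture what a control variable does to a table, here is a minimal sketch in Python. It is an illustration only: the function name, the category labels, and the records are all hypothetical, not real GSS data. The idea is that the year-by-variable table is rebuilt separately within each category of the control.

```python
from collections import Counter, defaultdict


def percent_by_row_within_control(records, target):
    """records: iterable of (control, row, outcome) triples.

    Returns {control: {row: percent of that row's respondents whose outcome == target}}.
    """
    totals = defaultdict(Counter)
    hits = defaultdict(Counter)
    for control, row, outcome in records:
        totals[control][row] += 1
        if outcome == target:
            hits[control][row] += 1
    return {
        c: {r: 100.0 * hits[c][r] / totals[c][r] for r in sorted(totals[c])}
        for c in sorted(totals)
    }


# Hypothetical toy records: (education group, survey year, answer).
toy = [
    ("college", 1990, "agree"), ("college", 1990, "disagree"),
    ("college", 2000, "agree"), ("college", 2000, "agree"),
    ("no college", 1990, "agree"), ("no college", 1990, "disagree"),
    ("no college", 1990, "disagree"), ("no college", 1990, "disagree"),
    ("no college", 2000, "agree"), ("no college", 2000, "disagree"),
]

result = percent_by_row_within_control(toy, target="agree")
for group, by_year in result.items():
    print(group, by_year)
```

Reading the output one group at a time shows whether the change over time looks the same in every category of the control or differs between them, which is the kind of comparison the second analysis asks you to write about.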
One key is for you to discuss the results as they really are, not as you want them to be or as you expected them to be given the literature. If your results contradict the literature in some way, that is good; explain the discrepancy.
The last page or so of your paper needs to address the literature we have read in the class. How do your findings agree with or disagree with something we have read? The quality of the paper will depend, in part, on how thoughtfully you use your results to reflect on one or more of the readings.
When writing about GSS variables, please include the GSS name (e.g. SOCBAR or DIVORCE) as well as the variable description and full text of the question.