GETTING STARTED

You can download a 14-day evaluation copy of SPSS for Windows from www.spss.com. Or you can use copies installed on the Leland systems computers (the elaines) or the Linguistics department student computer cluster. This tutorial will assume that you've already installed SPSS.

1) Start SPSS.

2) Load the data file.

   a) Go to File | Read Text Data. Choose the right file name.

   b) You'll enter the Text Import Wizard. In step 1 of 6, it asks "Does your text file match a predefined format?" Choose "No". Then press "next".

   c) In step 2, where it asks "How are your variables arranged?" choose "Delimited". Where it asks "Are variable names included at the top of your file?" choose "no". Then press "next".

   d) In steps 3 and 4, leave everything as is and press "next".

   e) In step 5, you get to name your variables. In turn, click on each column, go to the box that says "Variable name", and enter a name for the variable. In our case, I used the following names:

         Column 1: Deletion
         Column 2: POS (part of speech)
         Column 3: Environment
         Column 4: class (for social class)

      After you name a variable, you can also choose the data format -- basically the relevant choice is whether the values are numbers or strings. In our case, you should make sure that "class" has data format "Numeric" so that SPSS interprets its values (1 through 4) as numbers. Hit "next".

   f) In step 6 you have the option of saving this set of choices so that you can quickly load another file with exactly the same data format. We'll ignore this for now. Then hit "finish".

3) You should now have a spreadsheet-type view of the data file, with the first four columns being "Deletion", "POS", "Environment", "class". You can do a lot of manipulations (editing cells, cutting and pasting cells or groups of cells) directly on your data in Excel-type fashion. Take this opportunity to save your data to disk.
   You can then re-open the file directly from the File | Open | Data menu selection, without going through the whole "Read Text Data" process again.

4) You're now ready to do all sorts of analysis on the dataset. Most of the important options are under the "Analyze" menu in Windows (it's the "Statistics" menu in UNIX). For example, click on Analyze | Descriptive Statistics | Frequencies and then move all the variables from the left side to the right side by clicking on them one by one and pressing the button with the right-pointing arrow. Then click "OK". An "Output" window will pop up with tables for frequencies of each value for each variable. From these tables you can see things like:

      o 57.6% of the tokens have /-s/ deletion.
      o few of the tokens are verbs (1.3%)
      o the majority of the tokens precede a consonant (63.3%)

5) Since we're interested in the effects of the second-, third-, and fourth-column variables (part of speech, following environment, and social class) on the first variable (/-s/ deletion), we need to look at the frequency of deletion for different values of the other variables. From the menu, select Analyze | Descriptive Statistics | Crosstabs and move Deletion to the box under "Columns" and move POS to the box under "Rows". (I think you get the most readable layout for this data if you choose to list the independent variable values down the side of the table, and the dependent variable values across the top, especially because the dependent variable has only two values.)

   By default, the table will include only absolute counts, but it's nicer to see frequencies too, so click on "Cells" and then click on "Rows" under "Percentages" in the window that comes up, then click Continue. Finally, click OK, and in the Output window you will get a table of final /-s/ deletion frequency as it varies with the word's part of speech. You can see that deletion is most frequent (70.3%) for nouns, least frequent (33.6%) for verbs.
   If you follow all the steps above but use Environment or Class instead of POS, you get similar tables for these independent variables. You can see that "pause" is the following environment most favoring deletion, and that deletion is much less common (28.5%) in the highest social class than in all the other social classes (all > 50%). Personally, the biggest surprise to me so far is that deletion is less common for a following-consonant environment than for a following-vowel environment, but hey, I'm not a phonologist. Also, we can look to see whether this effect is maybe an artifact...

6) We can also create nested tables, which helps us look at the data in a bit more detail. Try this: choose Crosstabs again, and then put Environment in the "Rows" box, then put POS in the "Layer" box. When you click OK, you will get a nested table. For every part of speech, you can conveniently compare the percentages of deletion for each environment. You can see that for plural markers on adjectives and nouns, and for monomorphemic words, following consonants seem to discourage deletion, but for verbs and determiners, following environment doesn't seem to have much of an effect.

In the next few lectures we'll learn how to test more rigorously whether differing environment makes a difference for each part of speech, and vice versa.
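
If you'd like to double-check the SPSS tables outside the GUI, the same steps (load a headerless delimited file, name the four columns, get frequencies, then crosstabs with row percentages, then a nested table) can be sketched in Python with pandas. This is only a rough equivalent, not part of the SPSS workflow. Several things here are assumptions: the delimiter is taken to be whitespace, Deletion is taken to be coded 1 (deleted) or 0 (retained), and the handful of rows in TOY_DATA are invented purely so the sketch runs on its own; they are not the tutorial's data. The function names are made up for illustration.

```python
import io

import pandas as pd

# Invented rows for illustration only (NOT the tutorial's data set).
# Assumed layout: whitespace-delimited, no header row, Deletion coded 1/0.
TOY_DATA = """\
1 noun consonant 1
0 noun vowel 2
1 verb pause 3
0 verb consonant 4
1 noun pause 1
0 det vowel 4
1 det consonant 2
1 noun consonant 3
"""

# Same variable names as entered in step 5 of the Text Import Wizard.
COLUMNS = ["Deletion", "POS", "Environment", "class"]


def load_tokens(source):
    """Read a delimited text file with no header row and name the columns."""
    return pd.read_csv(source, sep=r"\s+", header=None, names=COLUMNS)


def frequencies(df, column):
    """Counts and percentages for one variable (cf. Analyze | Frequencies)."""
    counts = df[column].value_counts()
    percent = (100 * counts / counts.sum()).round(1)
    return pd.DataFrame({"count": counts, "percent": percent})


def row_percent_crosstab(df, row_vars, column="Deletion"):
    """Row percentages of `column` within each combination of `row_vars`
    (cf. Crosstabs with row percentages; two row_vars give a nested table)."""
    keys = [df[name] for name in row_vars]
    index = keys if len(keys) > 1 else keys[0]
    table = pd.crosstab(index, df[column], normalize="index")
    return (100 * table).round(1)


tokens = load_tokens(io.StringIO(TOY_DATA))
print(frequencies(tokens, "Deletion"))
print(row_percent_crosstab(tokens, ["POS"]))
# Nested table: Environment within each part of speech.
print(row_percent_crosstab(tokens, ["POS", "Environment"]))
```

To try this on a real file, pass its path to load_tokens instead of the in-memory toy text, and adjust `sep` if the file is comma- or tab-delimited.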