Education 161 Winter 2000
Assignment 2 Solutions
Feb 1, 2000

1. First order of business is to read in the outcomes and set up the data file.

MTB > read '[data from file]' c1
     36 ROWS READ

 C1
   26   23   28   19  . . .

** So far so good **

Now we can use the SET command to construct row and column indices
(or use "set patterned data" in the Minitab menu).

** set up row index **

MTB > set c2
DATA> (1:2)18.
DATA> end

** let's look at it **

MTB > print c2

C2
  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2

** set up col index **

MTB > set c3
DATA> 2(1:3)6
DATA> end
MTB > print c3

C3
  1  1  1  1  1  1  2  2  2  2  2  2  3  3  3  3  3  3
  1  1  1  1  1  1  2  2  2  2  2  2  3  3  3  3  3  3

a. For the profile plot, get the cell means:

MTB > table c2 c3;
SUBC> stats c1.

 ROWS: C2     COLUMNS: C3

              1         2         3       ALL

   1          6         6         6        18
         23.167    28.333    12.833    21.444
          3.971     4.320     4.792     7.801

   2          6         6         6        18
         20.500    25.000    32.833    26.111
          4.135     2.898     3.430     6.201

 ALL         12        12        12        36
         21.833    26.667    22.833    23.778
          4.108     3.916    11.175     7.337

 CELL CONTENTS --
          C1:N
             MEAN
             STD DEV

I won't try to draw the plot here. The main effects of both Time of Isolation and Level of Reinforcement are best interpreted keeping in mind the disordinal interaction indicated in the profile plot. The profile plot indicates that recall increases steadily with time in isolation only for verbally reinforced children. For unreinforced children, recall increases from 20 to 40 minutes of isolation, but decreases from the 40 minute level (and falls below the 20 minute level) at 60 minutes of isolation.
see Hopkins and Glass (HG) 18.2-18.5

b. The model for these data is:

   y(ijk) = mu + alpha(i) + beta(j) + alphabeta(ij) + epsilon(ijk)

where
   mu is the grand mean of all observations.
   y(ijk) is recall for child k observed within the group defined by reinforcement level i and isolation level j.
   alpha(i) is the effect of reinforcement at level i.
   beta(j) is the effect of isolation at level j.
   alphabeta(ij) is the effect of the interaction between reinforcement at level i and isolation at level j.
   epsilon(ijk) is a random error.
see HG 18.8

c. The ANOVA table below indicates significant main effects and a significant interaction between the two factors (Time of Isolation and Level of Reinforcement), using an overall error rate of .05.

MTB > anove c1 = c2|c3

*** Note: a wonderful feature of Minitab is that it executes misspelled commands as long as they are uniquely determinable ***

Factor     Type  Levels  Values
C2        fixed       2       1     2
C3        fixed       3       1     2     3

Analysis of Variance for C1

Source        DF         SS         MS        F       P
C2             1     196.00     196.00    12.42   0.001
C3             2     156.22      78.11     4.95   0.014
C2*C3          2    1058.67     529.33    33.55   0.000
Error         30     473.33      15.78
Total         35    1884.22

To get the critical values for the series of 3 hypothesis tests, we use Minitab:

MTB > invcdf .983;
SUBC> f 2 30.
    0.9830    4.6817

MTB > invcdf .983;
SUBC> f 1 30.
    0.9830    6.3871

This gives us each test at .017 (c.f. alphatot.tab). Every observed F exceeds its critical value, so we are able to reject the main effects and interaction null hypotheses.
see HG 18.17

------------------------------------------------------------------

2. part (f)

Note: remember, for Tukey we don't need to worry about splitting up the familywise error rate.

   width = 5 = 2*q(.95,I,dfw)*sqrt(MSW/n)
             = 2*q(.95,3,dfw)*sqrt(19.8/n)       using MSW from the previous analysis

a little algebra...

     2.5 = q(.95,3,dfw)*sqrt(19.8/n)
    6.25 = [q(.95,3,dfw)]^2 * 19.8/n
       n = 3.168 * [q(.95,3,dfw)]^2

The problem is that we don't know exactly what q is without knowing n, because q depends on the degrees of freedom within. So we should use any prior information we have to suggest a best guess. The widths of the intervals in our previous analysis were around 11. Since we want to cut this width approximately in half, we'll need to (approximately) quadruple our group sample sizes. So start with n=40 as a best guess, which gives dfw = 120-3 = 117. Using Table values, q(.95,3,117) = 3.36 (approx.). Therefore

       n = 3.168 * (3.36)^2 = 35.8, which rounds up to 36

i.e., 36 subjects in each of the three groups, which is pretty close to our original guess. Anything reasonably close to this number is acceptable.
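The sample-size algebra above can be checked numerically. The following is a minimal Python sketch, not part of the original Minitab session; the function name tukey_group_size is made up for illustration, and the value q = 3.36 is simply the table value quoted in the solution rather than something the code computes.

```python
import math

def tukey_group_size(width, msw, q):
    """Solve width = 2 * q * sqrt(msw / n) for the per-group n.

    `q` is a studentized-range critical value supplied by the caller
    (this sketch does not look it up).
    """
    half_width = width / 2.0
    return (q ** 2) * msw / (half_width ** 2)

msw = 19.8       # MSW from the previous analysis, as quoted in the solution
width = 5.0      # desired full width of each Tukey interval
q_guess = 3.36   # q(.95, 3, 117), the table value for the initial guess n = 40

n_exact = tukey_group_size(width, msw, q_guess)
n_needed = math.ceil(n_exact)  # round up to a whole subject
print(f"n = {n_exact:.2f}, so use {n_needed} per group")
```

If the resulting n were far from the initial guess, one would look up q again with the new dfw and repeat; here the guess of 40 and the answer are close enough that one pass suffices.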
---------------------------------------------------

3. (i) So first let's take a look at the descriptive statistics for these data (five groups, single classification).

MTB > desc c1-c5

             N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
C1           5    39.40    36.00    39.40     7.92     3.54
C2           5    44.20    41.00    44.20    10.40     4.65
C3           5     52.0     62.0     52.0     26.0     11.6
C4           5    40.80    43.00    40.80    10.13     4.53
C5           5    53.20    52.00    53.20    17.48     7.82

           MIN      MAX       Q1       Q3
C1       32.00    51.00    33.00    47.50
C2       32.00    56.00    35.00    55.00
C3        10.0     75.0     27.5     71.5
C4       30.00    55.00    31.00    49.50
C5       35.00    80.00    38.00    69.00

The standard deviations range from 8 to 26, representing variances ranging from 64 to 680, a ratio of about 10 to 1 (rather non-equal). Now, one consideration is that the sample sizes are very small (so we cannot assume these differences are highly significant), and since the sample sizes are also equal, we don't need to worry much about the effects of unequal variances on the one-way anova tests. We could, to carry this further, look at the Brown-Forsythe or Welch alternatives to standard anova (done in part ii).
see HG Table 15.3

ii. Standard one-way anova (unstacked data). The F-test for the null hypothesis of equal group means across the 5 groups (against the alternative of unequal means) has a minuscule value, .81. Compare this with the critical value based on F(4,20) (F.95(4,20) = 2.87): we cannot reject the null hypothesis of equal group means.

MTB > aovoneway c1-c5

ANALYSIS OF VARIANCE
SOURCE     DF        SS        MS        F        p
FACTOR      4       808       202     0.81    0.536
ERROR      20      5016       251
TOTAL      24      5824

                                 INDIVIDUAL 95 PCT CI'S FOR MEAN
                                 BASED ON POOLED STDEV
LEVEL     N      MEAN     STDEV  ----------+---------+---------+------
C1        5     39.40      7.92   (-----------*-----------)
C2        5     44.20     10.40       (-----------*-----------)
C3        5     52.00     25.97             (-----------*------------)
C4        5     40.80     10.13    (-----------*-----------)
C5        5     53.20     17.48              (-----------*------------)
                                 ----------+---------+---------+------
POOLED STDEV =    15.84                   36        48        60

(iii).
For comparison, we are asked to try out the non-parametric alternative to the standard one-way anova, Kruskal-Wallis. We need to put the data in stacked form to carry this out.

MTB > stack c1 c2 c3 c4 c5 into c6;
SUBC> subscripts c7.
MTB > kruskal-wallis c6 c7

LEVEL      NOBS    MEDIAN  AVE. RANK   Z VALUE
    1         5     36.00        9.5     -1.19
    2         5     41.00       12.3     -0.24
    3         5     62.00       17.0      1.36
    4         5     43.00       10.1     -0.99
    5         5     52.00       16.1      1.05
OVERALL      25                 13.0

H = 4.32  d.f. = 4  p = 0.366
H = 4.33  d.f. = 4  p = 0.364 (adj. for ties)

Turning to the chi-square distribution with (5-1) degrees of freedom (see NWK 18.7), the critical value is 9.49 (type I error rate .05). The test statistic H is 4.3. So we do not reject Ho, just as with the parametric anova on raw data (or even transformed data if we had tried to stabilize variance).
see HG 15.30

-----------------------------------------------

4. a. Interpretation of contrasts

Contrast 1: Is the average learning outcome score of animals given both food and water different from the average score of animals deprived of one or both substances?

Contrast 2: For animals given both food and water, does it make a difference whether they receive it "ad lib" versus twice a day?

Contrast 3: Does the average learning of animals deprived of either food or water differ from the learning of animals deprived of both?

Contrast 4: Does the effect of being deprived of food differ from the effect of being deprived of water?

see HG 17.8 but especially 17.9

b. Verifying orthogonality

First, write out the coefficients for all 4 contrasts:

Group      1      2      3      4      5
C1:      1/2    1/2   -1/3   -1/3   -1/3
C2:        1     -1      0      0      0
C3:        0      0    1/2    1/2     -1
C4:        0      0      1     -1      0

To verify orthogonality, multiply corresponding coefficients for a pair of contrasts and add them up. If the pair is orthogonal, this sum will be zero. Do this for each of the 4-choose-2 = 6 pairs.
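The check just described can also be run mechanically. The following Python sketch is an illustration only (the original solution does this by hand); it uses exact fractions so that a sum such as -1/6 - 1/6 + 1/3 comes out as exactly zero, and it relies on the equal group sizes in this problem, which is what lets the simple sum of products serve as the orthogonality criterion.

```python
from fractions import Fraction as F
from itertools import combinations

# Contrast coefficients for groups 1..5, copied from the table above.
contrasts = {
    "C1": [F(1, 2), F(1, 2), F(-1, 3), F(-1, 3), F(-1, 3)],
    "C2": [F(1), F(-1), F(0), F(0), F(0)],
    "C3": [F(0), F(0), F(1, 2), F(1, 2), F(-1)],
    "C4": [F(0), F(0), F(1), F(-1), F(0)],
}

def cross_product_sum(a, b):
    """Sum of products of corresponding coefficients of two contrasts."""
    return sum(x * y for x, y in zip(a, b))

# Each set of coefficients must itself sum to zero to be a contrast.
coefficient_sums = {name: sum(coefs) for name, coefs in contrasts.items()}

# One cross-product sum per pair of contrasts; zero means orthogonal.
pair_sums = {
    (p, q): cross_product_sum(contrasts[p], contrasts[q])
    for p, q in combinations(contrasts, 2)
}

for pair, total in pair_sums.items():
    print(pair, total)
```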
C1 & C2: sigma(a[i]b[i]) = (1/2)(1)+(1/2)(-1)+0+0+0 = 0
C1 & C3: 0+0+(-1/3)(1/2)+(-1/3)(1/2)+(-1/3)(-1) = 0
C1 & C4: 0+0+(-1/3)(1)+(-1/3)(-1)+0 = 0
C2 & C3: 0+0+0+0+0 = 0
C2 & C4: 0+0+0+0+0 = 0
C3 & C4: 0+0+(1/2)(1)+(1/2)(-1)+0 = 0

So all contrasts are orthogonal with one another. And since we have 5-1=4 degrees of freedom between, we have a full set of orthogonal contrasts here.
see HG 17.16

c. First run the ANOVA, which will give us the sample group means and other things we'll need later.

MTB > read '/usr/class/ed257/96hw1p5.dat' c1-c5
      5 ROWS READ

 ROW    C1    C2    C3    C4    C5
   1    18    20     6    15    12
   2    20    25     9    10    11
   3    21    23     8     9     8
   4    16    27     6    12    13
  . . .

MTB > aovoneway c1-c5

ANALYSIS OF VARIANCE
SOURCE     DF        SS        MS        F        p
FACTOR      4    816.00    204.00    36.43    0.000
ERROR      20    112.00      5.60
TOTAL      24    928.00

                                 INDIVIDUAL 95 PCT CI'S FOR MEAN
                                 BASED ON POOLED STDEV
LEVEL     N      MEAN     STDEV  -+---------+---------+---------+-----
C1        5    18.000     2.550                   (---*---)
C2        5    24.000     2.646                             (---*---)
C3        5     8.000     2.121   (--*---)
C4        5    12.000     2.550         (---*---)
C5        5    11.000     1.871        (--*---)
                                 -+---------+---------+---------+-----
POOLED STDEV =    2.366          6.0      12.0      18.0      24.0

The point estimate for a contrast, denoted l-hat, is

   l-hat = a[1]X-bar[1] + ... + a[I]X-bar[I]

Use the sample means above to construct point estimates for our 4 contrasts:

   l-hat[1] = (1/2)(18)+(1/2)(24)+(-1/3)(8)+(-1/3)(12)+(-1/3)(11) = 10.67
   l-hat[2] = (1)(18)+(-1)(24) = -6
   l-hat[3] = (1/2)(8)+(1/2)(12)+(-1)(11) = -1
   l-hat[4] = (1)(8)+(-1)(12) = -4

d. Looking up alpha(total) = .10 and C=4 on alphatot.tab gives an individual Type I error rate of .0259963. So our t critical value is t(20, .9871). To find the actual value of t, Minitab gives us:

MTB > invcdf .9871;
SUBC> t 20.
    0.9871    2.4082

Interval estimates:

   l-hat[1]: 10.67 +/- 2.4082*sqrt(5.6(1/4+1/4+1/9+1/9+1/9)/5) = [8.34, 12.99]
   l-hat[2]:    -6 +/- 2.4082*sqrt(5.6(1+1)/5)                 = [-9.60, -2.40]
   l-hat[3]:    -1 +/- 2.4082*sqrt(5.6(1/4+1/4+1)/5)           = [-4.12, 2.12]
   l-hat[4]:    -4 +/- 2.4082*sqrt(5.6(1+1)/5)                 = [-7.60, -.40]

Only one of these intervals, the one for contrast 3, contains zero. So we conclude that food & water versus some deprivation (l-hat[1]), the timing of receiving food & water (l-hat[2]), and being deprived of food versus water (l-hat[4]) do make a difference on our learning outcome measure, whereas being deprived of either food or water versus being deprived of both does not matter. Note that the interval for the 4th contrast comes pretty close to 0 but does not contain it.
see HG 17.17, and formula 17.8A-C

--------------------------------------------

5. "More on interactions..."

Preamble: This problem was constructed to address the question: what more can be done to describe (or draw inferences about) the interaction terms beyond just rejecting (or not) the omnibus null hypothesis of no interaction? Previously, we had discussed the importance of the profile plot as the major descriptive technique. A strategy for following up the graphical display involves estimating the row effects separately for each level of the column factor. The college mathematics learning example (a 2x3 design) that was described in class is a good template for this example.

Note: In the output below, the columns c1-c4 in the data file extraint.dat are labelled as c10-c13; that is, the outcome is in c10 and the row and column indicators are in c12 and c13. Columns c1-c6 here contain the data for each cell in the 2x3 design.

The data (5 replications in the 2x3 design) were generated with a within-cell variance of 9 and cell means

     12   15   18
     10   11   12

Here's how I did it, putting each cell in its own column.

Generate the data:

MTB > random 5 c1;
SUBC> normal 12 3.
MTB > random 5 c2;
SUBC> normal 15 3.
MTB > random 5 c3;
SUBC> normal 18 3.
MTB > random 5 c4;
SUBC> normal 10 3.
MTB > random 5 c5;
SUBC> normal 11 3.
MTB > random 5 c6;
SUBC> normal 12 3.

Here's the data with each cell of the 2x3 in its own column:

MTB > print c1-c6

 ROW        C1        C2        C3        C4        C5        C6
   1    8.1988   20.3003   17.1762    9.5724    9.7497   10.3652
   2    8.7541   16.3191   18.7042   11.1114    6.6897   12.2657
   3   14.5011   14.4865   16.4200   13.2167   13.9392    9.7098
   4    8.0545   12.3120   17.5973   11.9217   14.0845   11.9133
   5    7.8095   14.6511   21.7529   11.6024    7.9765   10.0781

------------------------------

Let's start the analysis.

Describe the two-way data using the stacked form that you have in extraint.dat:

MTB > table c12 c13;
SUBC> stats c10.

 ROWS: C12     COLUMNS: C13

              1         2         3       ALL

   1          5         5         5        15
          9.464    15.614    18.330    14.469
          2.837     2.982     2.084     4.563

   2          5         5         5        15
         11.485    10.488    10.866    10.946
          1.323     3.396     1.147     2.086

 ALL         10        10        10        30
         10.474    13.051    14.598    12.708
          2.343     4.047     4.241     3.919

 CELL CONTENTS --
          C10:N
              MEAN
              STD DEV

Obtain the anova table:

MTB > twoway c10 c12 c13

ANALYSIS OF VARIANCE   C10

SOURCE          DF        SS        MS
C12              1     93.07     93.07
C13              2     86.80     43.40
INTERACTION      2    122.10     61.05
ERROR           24    143.52      5.98
TOTAL           29    445.49

Clearly, the interaction is significant, as are the main effects. A profile plot based on the sample data will show a marked interaction which actually appears disordinal, even though for the population cell means the interaction is ordinal.

First, let's compare the row cell means at each level of the column factor (1, 2, 3 respectively) by a series of two-sample t inferences. The default 95% interval produced by twosample is shown first. This is most likely what is commonly done in the literature when such a comparison is attempted. We know that MUCH better practice would be to specify (following Bonferroni) a set of roughly 98.3% intervals (1 - .05/3) to control the overall confidence coefficient at approximately 95%.

Here I have my original data set with each cell in its own column; you can get there by unstacking extraint.dat.
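As a cross-check on the row comparisons, the differences in cell means, unpooled t statistics, and approximate degrees of freedom can be recomputed from the printed data. The following Python sketch is an illustration, not part of the original session. It assumes the unpooled (Welch-type) form of the two-sample t, which is consistent with the DF values Minitab reports; the helper name welch is made up. It does not compute t critical values, so it stops short of the intervals themselves.

```python
import math
from statistics import mean, variance

# Cell data copied from the "print c1-c6" listing above.
cells = {
    "c1": [8.1988, 8.7541, 14.5011, 8.0545, 7.8095],
    "c2": [20.3003, 16.3191, 14.4865, 12.3120, 14.6511],
    "c3": [17.1762, 18.7042, 16.4200, 17.5973, 21.7529],
    "c4": [9.5724, 11.1114, 13.2167, 11.9217, 11.6024],
    "c5": [9.7497, 6.6897, 13.9392, 14.0845, 7.9765],
    "c6": [10.3652, 12.2657, 9.7098, 11.9133, 10.0781],
}

def welch(x, y):
    """Return (difference in means, t statistic, approximate df), unpooled."""
    vx = variance(x) / len(x)   # squared standard error of mean(x)
    vy = variance(y) / len(y)
    diff = mean(x) - mean(y)
    t = diff / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return diff, t, df

# Bonferroni: per-interval confidence level for k intervals, overall about .95.
k, overall_alpha = 3, 0.05
per_interval_level = 1 - overall_alpha / k

results = {}
for top, bottom in [("c1", "c4"), ("c2", "c5"), ("c3", "c6")]:
    results[(top, bottom)] = welch(cells[top], cells[bottom])
    diff, t, df = results[(top, bottom)]
    print(f"{top} - {bottom}: diff = {diff:.3f}, t = {t:.2f}, df = {df:.2f}")
print(f"per-interval confidence level: {per_interval_level:.4f}")
```

The df values come out fractional; Minitab appears to report the integer part.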
Comparing the two rows at the first level of the column factor:

MTB > twosample c1 c4

TWOSAMPLE T FOR C1 VS C4
      N      MEAN     STDEV   SE MEAN
C1    5      9.46      2.84       1.3
C4    5     11.48      1.32      0.59

95 PCT CI FOR MU C1 - MU C4: (-5.6, 1.58)
TTEST MU C1 = MU C4 (VS NE): T= -1.44  P=0.21  DF= 5

Same for the second level of the column factor:

MTB > twosample c2 c5

TWOSAMPLE T FOR C2 VS C5
      N      MEAN     STDEV   SE MEAN
C2    5     15.61      2.98       1.3
C5    5     10.49      3.40       1.5

95 PCT CI FOR MU C2 - MU C5: (0.3, 9.9)
TTEST MU C2 = MU C5 (VS NE): T= 2.54  P=0.039  DF= 7

And for the third level of the column factor:

MTB > twosample c3 c6

TWOSAMPLE T FOR C3 VS C6
      N      MEAN     STDEV   SE MEAN
C3    5     18.33      2.08      0.93
C6    5     10.87      1.15      0.51

95 PCT CI FOR MU C3 - MU C6: (4.86, 10.07)
TTEST MU C3 = MU C6 (VS NE): T= 7.02  P=0.0004  DF= 6

An interesting note on the above is that the interval estimate at the second level of the column factor will include 0 for any reasonable use of an overall confidence coefficient of 95% (i.e. p-value =~ .04).

------

Now let's use a more appropriate confidence coefficient considering we are doing 3 of these: 98.3%, which is close to Bonferroni with overall 95% for the 3. (I checked that Minitab really does do .983 confidence here, not just .98 as the output indicates.) The confidence intervals are as you would expect; the second includes 0.

MTB > twosample .983 c1 c4.

TWOSAMPLE T FOR C1 VS C4
      N      MEAN     STDEV   SE MEAN
C1    5      9.46      2.84       1.3
C4    5     11.48      1.32      0.59

98 PCT CI FOR MU C1 - MU C4: (-6.9, 2.90)
TTEST MU C1 = MU C4 (VS NE): T= -1.44  P=0.21  DF= 5

MTB > twosample .983 c2 c5.

TWOSAMPLE T FOR C2 VS C5
      N      MEAN     STDEV   SE MEAN
C2    5     15.61      2.98       1.3
C5    5     10.49      3.40       1.5

98 PCT CI FOR MU C2 - MU C5: (-1.2, 11.4)
TTEST MU C2 = MU C5 (VS NE): T= 2.54  P=0.039  DF= 7

MTB > twosample .983 c3 c6.

TWOSAMPLE T FOR C3 VS C6
      N      MEAN     STDEV   SE MEAN
C3    5     18.33      2.08      0.93
C6    5     10.87      1.15      0.51

98 PCT CI FOR MU C3 - MU C6: (3.98, 10.95)
TTEST MU C3 = MU C6 (VS NE): T= 7.02  P=0.0004  DF= 6

----------------------------

Now compare with pairwise intervals from Tukey with family-wise 95% confidence.

Here I use the stacked data in the form you have it in extraint.dat. Oneway is run on the outcome and the indicator of group (1,...,6).

MTB > oneway c10 c11;
SUBC> tukey.

ANALYSIS OF VARIANCE ON C10
SOURCE     DF        SS        MS        F        p
C11         5    301.97     60.39    10.10    0.000
ERROR      24    143.52      5.98
TOTAL      29    445.49

                                 INDIVIDUAL 95 PCT CI'S FOR MEAN
                                 BASED ON POOLED STDEV
LEVEL     N      MEAN     STDEV  --+---------+---------+---------+----
1         5     9.464     2.837  (-----*----)
2         5    15.614     2.982                 (-----*-----)
3         5    18.330     2.084                        (-----*----)
4         5    11.485     1.323       (-----*----)
5         5    10.488     3.396     (----*-----)
6         5    10.866     1.147      (----*-----)
                                 --+---------+---------+---------+----
POOLED STDEV =    2.445          8.0      12.0      16.0      20.0

Tukey's pairwise comparisons

    Family error rate = 0.0500
Individual error rate = 0.00498

Critical value = 4.37

Intervals for (column level mean) - (row level mean)

              1          2          3          4          5

   2    -10.929
         -1.371

   3    -13.646     -7.496
         -4.087      2.063

   4     -6.801     -0.650      2.066
          2.758      8.908     11.624

   5     -5.803      0.347      3.063     -3.782
          3.755      9.905     12.621      5.776

   6     -6.182     -0.032      2.685     -4.161     -5.158
          3.376      9.527     12.243      5.398      4.401

The 2,5 entry is the most interesting. Tukey provides an interval that does not include 0 with family-wise confidence coeff 95% that about matches the two-sample based interval having a far lower confidence coeff for the set of 3 intervals.

------------------------------------------

Finally, another approach is to construct comparisons using the Bonferroni method. The potential advantage of this over the Tukey results above is that we can limit ourselves to just the 3 comparisons that we seek.
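The trade-off between the two methods can be sketched numerically. The Python below is an illustration only, not part of the original session. Both critical values are copied rather than computed: 4.37 is the Tukey critical value printed by Minitab above, and 2.5754 is the Bonferroni t quantile t(.9917, 24) that this solution uses for 3 two-sided comparisons. The cell means, MSE, and n come from the oneway output.

```python
import math

mse, n = 5.98, 5     # error mean square and per-cell n from the oneway output
q_crit = 4.37        # Tukey critical value as printed by Minitab
t_crit = 2.5754      # t(.9917, 24): Bonferroni for 3 comparisons (copied)

# Cell means by group number, from the oneway output.
cell_means = {1: 9.464, 2: 15.614, 3: 18.330, 4: 11.485, 5: 10.488, 6: 10.866}
pairs = [(1, 4), (2, 5), (3, 6)]   # row 1 vs row 2 at each column level

# Minitab's critical value is on the studentized-range scale, so the
# half-width is q * sqrt(MSE / n) (equivalently q / sqrt(2) * SE of a difference).
tukey_half = q_crit * math.sqrt(mse / n)
bonf_half = t_crit * math.sqrt(2 * mse / n)

intervals = {}
for a, b in pairs:
    d = cell_means[a] - cell_means[b]
    intervals[(a, b)] = {
        "tukey": (d - tukey_half, d + tukey_half),
        "bonferroni": (d - bonf_half, d + bonf_half),
    }

for pair, both in intervals.items():
    print(pair, {k: (round(lo, 2), round(hi, 2)) for k, (lo, hi) in both.items()})
print("full widths:", round(2 * tukey_half, 2), round(2 * bonf_half, 2))
```

Bonferroni is narrower here because it covers only 3 planned comparisons, while Tukey protects all pairwise comparisons among the 6 cells.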
Point estimates:

   D-hat[1] =  9.464 - 11.485 = -2.021
   D-hat[2] = 15.614 - 10.488 =  5.126
   D-hat[3] = 18.330 - 10.866 =  7.464

   Var(D-hat) = 2*MSW/n = 2(5.98)/5 = 2.392

   B = t(1-.05/6, 24) = t(.9917, 24) = 2.5754

Therefore the width of each interval is

   2*2.5754*sqrt(2.392) = 2*3.98 = 7.96

Intervals:

   MU C1 - MU C4: (-2.021-3.98, -2.021+3.98) = (-6.00, 1.96)
   MU C2 - MU C5: (1.15, 9.11)
   MU C3 - MU C6: (3.48, 11.44)

As with the Tukey intervals, only the interval for MU C1 - MU C4 contains zero. Notice that these intervals are somewhat narrower than the Tukey intervals (7.96 versus 9.56 width). Since we're only interested in 3 of the 15 possible pairwise comparisons among the 6 cells, the Bonferroni method appears to be an improvement over Tukey.

---------------------------

Here's a summary table for the comparison of the rows at each of the 3 levels of the column factor.

                         2-sample t       Tukey               Bonferroni t-int
                         .983             overall .95         overall .95
                         ----------       -----------         ----------------
CI FOR MU C1 - MU C4:    (-6.9, 2.90)     (-6.801, 2.758)     (-6.00, 1.96)
CI FOR MU C2 - MU C5:    (-1.2, 11.4)     ( 0.347, 9.905)     (1.15, 9.11)
CI FOR MU C3 - MU C6:    (3.98, 10.95)    ( 2.685, 12.243)    (3.48, 11.44)

----------------------------------------------------------------------------

problem 6 solution
Hopkins and Glass Problems 1 and 2 on page 527

First things first: we have to enter the data into a useful data structure. Having entered the vocab scores into c1, we need to create group membership variables for each of the factors. One way is to manually enter 12 ones and then 12 twos into c2 and name it IQ, then enter the appropriate sequence of ones, twos, and threes in c3 and name it Method. For small datasets like this one, this is an acceptable approach, but it becomes quite tedious and time consuming with large datasets. Below you will find commands that help with this process.
To create a series of 12 ones and then 12 twos in a column named IQ, we do this either from the command line or from the Calc...'make patterned data' menu item:

MTB > Name c2 = 'IQ'
MTB > Set 'IQ'
DATA> 1( 1 : 2 / 1 )12
DATA> End.

Note that your use of this procedure depends on the order in which you entered the vocab scores.

To make a repeating series of 4 ones, 4 twos, and 4 threes in a column named Method, we do this:

MTB > Name c3 = 'Method'
MTB > Set 'Method'
DATA> 2( 1 : 3 / 1 )4
DATA> End.

The first part of the data analysis is to look at the cell means and plot them by factors.

MTB > table c2 c3;
SUBC> mean c1.

Tabulated Statistics

 Rows: IQ     Columns: Method

              1         2         3       All

   1     31.000    23.000    24.000    26.000
   2     29.000    18.000    19.000    22.000
 All     30.000    20.500    21.500    24.000

From this we might presume that there is a column effect, a possible row effect, and probably no interaction effect.

Typing the following commands will give you a nice plot of cell means by factors. This is also available from the Stats..ANOVA..Interaction Plot menu. The plot shows the same features as an examination of the cell means.

MTB > %Interact 'IQ' 'Method';
SUBC> Response 'vocab'.

Now, we can run a two-way ANOVA in a number of different ways. As long as the cells are balanced, we might as well use the anova command, although we could use 'twoway' or another technique we haven't talked about... glm. Here is what the Minitab help file has to say about our options:

"Two-way analysis of variance performs an analysis of variance for testing the equality of populations means when classification of treatments is by two variables or factors. Data must be balanced (all cells must have the same number of observations) and factors must be fixed. If you wish to specify certain factors to be random, use Balanced ANOVA if your data are balanced; use General Linear Models if your data are unbalanced or if you wish to compare means using multiple comparisons."
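Because the design is balanced, the sums of squares for the two main effects and the interaction can be computed directly from the table of cell means. The Python sketch below is an illustration, not part of the original session; it assumes 4 observations per cell, which follows from 24 scores spread over the 2x3 layout set up above. The error sum of squares cannot be recovered this way, since it needs the individual scores.

```python
# Cell means from the "table c2 c3; mean c1." output (rows: IQ, columns: Method).
cell_means = [
    [31.0, 23.0, 24.0],
    [29.0, 18.0, 19.0],
]
n_per_cell = 4   # assumption: 24 scores / (2 IQ levels * 3 methods)

rows, cols = len(cell_means), len(cell_means[0])
row_means = [sum(r) / cols for r in cell_means]
col_means = [sum(cell_means[i][j] for i in range(rows)) / rows for j in range(cols)]
grand_mean = sum(row_means) / rows

# Main effects: squared deviations of marginal means, weighted by the
# number of observations behind each marginal mean.
ss_iq = n_per_cell * cols * sum((m - grand_mean) ** 2 for m in row_means)
ss_method = n_per_cell * rows * sum((m - grand_mean) ** 2 for m in col_means)

# Interaction: what is left of each cell mean after removing both margins.
ss_interaction = n_per_cell * sum(
    (cell_means[i][j] - row_means[i] - col_means[j] + grand_mean) ** 2
    for i in range(rows)
    for j in range(cols)
)

print(ss_iq, ss_method, ss_interaction)
```

These three values can be compared with the SS column of the ANOVA output for IQ, Method, and IQ*Method.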
Using anova:

MTB > ANOVA 'vocab' = IQ| Method

Analysis of Variance (Balanced Designs)

Factor     Type  Levels  Values
IQ        fixed       2       1     2
Method    fixed       3       1     2     3

Analysis of Variance for vocab

Source        DF         SS         MS        F       P
IQ             1      96.00      96.00     4.11   0.058
Method         2     436.00     218.00     9.34   0.002
IQ*Method      2      12.00       6.00     0.26   0.776
Error         18     420.00      23.33
Total         23     964.00

What sense do we make of the 3 F values? If we want an experiment-wide type I error rate of about .05, then the alpha used to calculate the critical F value for each hypothesis could be .05/3 = .0167, using the Bonferroni inequality. (As mentioned in class, there are other ways of controlling type I error when several hypotheses are tested. We could have just as easily decided to use a .02 alpha for each of the main factors and alpha = .01 for the test of the interaction hypothesis. To be fair, all of this should be decided before you see the results of the test.)

To test the null hypothesis that IQ doesn't matter, we got F(1, 18) = 4.11.

MTB > InvCDF .9833;        (this number is 1-alpha)
SUBC> F 1 18.

Inverse Cumulative Distribution Function

F distribution with 1 DF in numerator and 18 DF in denominator

 P( X <= x)          x
     0.9833     6.9601

Thus F < Fcrit, and we fail to reject the hypothesis that IQ doesn't matter. Note that this hypothesis can be stated more formally in terms of factor means (all equal) or main effects (all zero).

To test the null hypothesis that Method doesn't matter, we got F(2, 18) = 9.34.

MTB > InvCDF .9833;
SUBC> F 2 18.

Inverse Cumulative Distribution Function

F distribution with 2 DF in numerator and 18 DF in denominator

 P( X <= x)          x
     0.9833     5.1814

Thus F > Fcrit, and we reject the null hypothesis that method doesn't matter and accept the alternative that method does matter. Put more formally, the treatment effects for the levels of method do not all equal zero.

Lastly, we can use the same logic and procedures to fail to reject the hypothesis that there is no interaction effect.
(F = .26 < Fcrit = 5.1814)
see HG 18.6, 18.17, 18.19

Question 2)

To get multiple comparisons for factorial designs using Minitab, we have to use the General Linear Model (GLM) function. For balanced designs, glm should give us virtually the same results as the anova command. See below that the F values we get for our 3 hypotheses are the same as we got from anova.

MTB > GLM 'vocab' = IQ| Method;
SUBC> Brief 2;
SUBC> Pairwise Method;
SUBC> Tukey.

General Linear Model

Analysis of Variance for vocab, using Adjusted SS for Tests

Source        DF     Seq SS     Adj SS     Adj MS        F       P
IQ             1      96.00      96.00      96.00     4.11   0.058
Method         2     436.00     436.00     218.00     9.34   0.002
IQ*Method      2      12.00      12.00       6.00     0.26   0.776
Error         18     420.00     420.00      23.33
Total         23     964.00

Tukey 95.0% Simultaneous Confidence Intervals
Response Variable vocab
All Pairwise Comparisons among Levels of Method

Method = 1 subtracted from:

Method      Lower    Center     Upper  ---+---------+---------+---------+---
2          -15.67    -9.500    -3.335   (-------*--------)
3          -14.67    -8.500    -2.335    (--------*--------)
                                       ---+---------+---------+---------+---
                                       -14.0      -7.0       0.0       7.0

Method = 2 subtracted from:

Method      Lower    Center     Upper  ---+---------+---------+---------+---
3          -5.165     1.000     7.165                  (-------*--------)
                                       ---+---------+---------+---------+---
                                       -14.0      -7.0       0.0       7.0

From these 95% CIs for the differences between the means of the 3 levels of Method, we can see that method 1 is significantly different from methods 2 and 3, and that method 2 is not significantly different from method 3.

GLM also gives us the T statistics for each of the pairwise comparisons, as presented below.
Tukey Simultaneous Tests
Response Variable vocab
All Pairwise Comparisons among Levels of Method

Method = 1 subtracted from:

Level     Difference        SE of               Adjusted
Method      of Means   Difference    T-Value     P-Value
2             -9.500        2.415     -3.933      0.0027
3             -8.500        2.415     -3.519      0.0066

Method = 2 subtracted from:

Level     Difference        SE of               Adjusted
Method      of Means   Difference    T-Value     P-Value
3              1.000        2.415     0.4140      0.9103

By hand, we can get the intervals by calculating Tukey's HSD (Alpha = .05):

   (3.61)(1.71) = 6.17        (see H&G, p.525 for notation)

So, each interval can be constructed as the point estimate plus and minus 6.17. For example, mean(method2)-mean(method1) = -9.5, and plus and minus 6.17 gives the interval (-15.67, -3.33). Same answer as above.
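The hand calculation can be reproduced in a few lines. This Python sketch is an illustration, not part of the original session. The studentized-range value 3.61 is copied from the hand calculation above rather than computed, and n = 8 per method level is an assumption that follows from 24 scores over 3 methods (it also reproduces the SE of difference, 2.415, shown in the output).

```python
import math

q_crit = 3.61       # studentized-range value used in the hand calculation (copied)
mse = 23.33         # error mean square from the ANOVA table
n_per_level = 8     # assumption: 24 scores / 3 methods

# Tukey's HSD: half-width shared by every pairwise interval.
hsd = q_crit * math.sqrt(mse / n_per_level)

# Standard error of a difference between two method means, for comparison
# with the "SE of Difference" column in the GLM output.
se_diff = math.sqrt(2 * mse / n_per_level)

# Differences in means as reported by GLM (later level minus earlier level).
differences = {"2 - 1": -9.5, "3 - 1": -8.5, "3 - 2": 1.0}
intervals = {name: (d - hsd, d + hsd) for name, d in differences.items()}

print(f"HSD = {hsd:.2f}, SE of difference = {se_diff:.3f}")
for name, (lo, hi) in intervals.items():
    print(f"method {name}: ({lo:.2f}, {hi:.2f})")
```

An interval that excludes 0 corresponds to a pair of methods that differ at the family-wise .05 level.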