Education 161
Winter 2000
Assignment 1
Due Jan 18, 2000

Note: data files are available in one of two locations:
  path: /usr/class/ed161/[data file]
or using web services at URL
  http://www.stanford.edu/class/ed161/hw/[data file]
-----------------------------------
1. The 179 participants in the Cartoon experiment (description below,
data in cartoon.dat) each saw cartoon and realistic slides.
(a) Construct an appropriate statistical test to see if there is any
difference between the scores each person obtained on the two types of
slides immediately after presentation.
(b) Construct a 90% confidence interval for the difference between
these scores on the two types of slides.
------------
Cartoon Data in cartoon.dat

When educators make an instructional film, they have two objectives:
Will the people who watch the film learn the material as efficiently as
possible? Will they retain what they have learned? To help answer these
questions, an experiment was conducted to evaluate the relative
effectiveness of cartoon sketches and realistic photographs, in both
color and black and white visual materials. A short instructional slide
presentation was developed. The topic chosen for the presentation was the
behavior of people in a group situation and, in particular, the various
roles or character types that group members often assume. The
presentation consisted of a five-minute lecture on tape, accompanied by
18 slides. Each role was identified as an animal. Each animal was shown
on two slides: once in a cartoon sketch and once in a realistic picture.
All 179 participants saw all 18 slides, but a randomly selected half of
the participants saw them in black and white while the other half saw
them in color. After they had seen the slides, the participants took a
test (immediate test) on the material. The 18 slides were presented in a
random order, and the participants wrote down the character type
represented by each slide.
They received two scores: one for the number of cartoon characters they
correctly identified and one for the number of realistic characters they
correctly identified. Each score could range from 0 to 9, since there
were nine characters. Four weeks later, the participants were given
another test (delayed test) and their scores were computed again. Some
participants did not show up for this delayed test, so their scores were
given the missing value code *.

The primary participants in this study were preprofessional and
professional personnel at three hospitals in Pennsylvania involved in an
in-service training program. A group of Penn State undergraduate students
also were given the test as a comparison. All participants were given the
OTIS Quick Scoring Mental Ability Test, which yielded a rough estimate of
their natural ability.

Some questions that are of interest here are as follows: Is there a
difference between color and black and white visual aids? Between cartoon
and realistic? Is there any difference in retention? Does any difference
depend on educational level or location? Does adjusting for OTIS scores
make any difference?

The data are given below. They have been sorted partially so that various
parts may easily be studied separately.

Description of Cartoon Data

Variable       Description
C1  ID         Identification number
C2  COLOR      0 = black and white, 1 = color (no participant saw both)
C3  ED         Education: 0 = preprofessional, 1 = professional,
               2 = college student
C4  LOCATION   Location: 1 = hospital A, 2 = hospital B,
               3 = hospital C, 4 = Penn State student
C5  OTIS       Score: from about 70 to about 130
C6  CARTOON1   Score on cartoon test given immediately after presentation
               (possible scores are 0, 1, 2, ..., 9)
C7  REAL1      Score on realistic test given immediately after
               presentation (possible scores are 0, 1, 2, ..., 9)
C8  CARTOON2   Score on cartoon test given four weeks (delayed) after
               presentation (possible scores are 0, 1, 2, ..., 9;
               * is used for a missing observation)
C9  REAL2      Score on realistic test given four weeks (delayed) after
               presentation (possible scores are 0, 1, 2, ..., 9;
               * is used for a missing observation)
==============================================================================
2. The file rat.dat gives the drop in blood pressure for three groups of
six rats from a strain of hypertensive rats. The six rats in the first
group (C1) were treated with a low dose of an antihypertensive product,
the second group (C2) with a higher dose of the same antihypertensive
product, and the third group (C3) with an inert control. Note that the
variability in blood pressure decreases, even for rats in the control
group. Also note that negative values represent increases in blood
pressure.
Construct a 95% confidence interval for the difference in population
means between the low dose group and the control group. Use Minitab with
these data to construct the interval estimate, making no assumption about
equality of the group variances.
=============================================
3. Complete the Anova Table given below. Also state and carry out a test
of the omnibus null hypothesis with Type I error rate .10.

SOURCE      SS    df    MS
Between     80     4    **
Within      **     *    **
Total      480    44
--------------------------------------------------------------------------
4. Salary disputes and their eventual resolutions often leave both
employer and employees embittered by the entire ordeal.
To assess employee reactions to a recently devised salary and fringe
benefits plan, the personnel department obtained random samples of 15
employees from each of three divisions: manufacturing, marketing, and
research. Each employee sampled was asked to respond (in confidence) to a
series of questions. Several employees refused to cooperate, as reflected
in the unequal sample sizes. A summary of the data is given below.

                   Manufacturing   Marketing   Research
Sample Size             12            14          11
Sample Mean             25.2          32.6        28.1
Sample Variance          3.6           4.8         5.3

a. Write a model for this data structure.
b. Carry out an omnibus test of all three employee groups having equal
population means using a standard one-way analysis of variance procedure.
Use Type I error rate .01.
----------------------------------------------------------------------------
5. I've had more knee operations than you've had statistics courses....
A rehabilitation center researcher was interested in examining the
relationship between physical fitness prior to surgery of persons
undergoing corrective knee surgery and time required in physical therapy
until successful rehabilitation. 24 male subjects ranging in age from 18
to 30 years who had undergone similar corrective knee surgery during the
past year were selected for the study. In the data file knee.dat, c1
contains the number of days required for successful completion of
physical therapy and c2 contains an indicator of prior physical fitness
status: 1 = below average; 2 = average; 3 = above average. (So this data
set is of the form of a time-to-mastery study.)
a) Obtain the mean and variance of time to recovery for each group.
b) Present a graphical look at the scores for the three groups by
constructing aligned dotplots for the three groups.
c) Carry out an anova for this one-way classification and test the
omnibus null hypothesis of no differences between the group means using
Type I error rate .05.
d) Display residuals from the fit of the anova model for each group.
e) Carry out the post-hoc pairwise comparison procedure in order to
obtain interval estimates of each pairwise comparison using
experimentwise error rate .05.
==============================================================
6. In the class materials, the file smsg.dat contains the data for the
SMSG versus traditional mathematics instruction evaluation discussed in
the first week of Ed257 (refer to Web site description). C1 is group
(1 = SMSG, 2 = trad.); C2 is class mean mathematics achievement. Carry
out a single classification anova (here the classification variable has
only two levels) and show the equivalence to a pooled t-test. Unstacking
these data and using aovoneway will avoid an error message from some
Minitab versions about unequal group sizes.
----------------------------------------------------------------
7. An experiment was conducted to compare the effectiveness of five
different weight-reducing agents. A random sample of 50 males was
randomly divided into five equal groups, with preparation A assigned to
the first group, B to the second group, and so on. Each person in the
experiment was given a prestudy physical and told how many pounds
overweight he was. A comparison of the mean number of pounds overweight
for the groups showed no significant differences. The study program was
then begun, with each group taking the prescribed preparation for a fixed
period of time. At the end of the study period, weight losses were
recorded. The data for the 5 groups are given in columns 1-5 in file
weightloss.dat in the class directory.
a. Use standard one-way analysis of variance to carry out a test of the
omnibus null hypothesis of equal effectiveness of the weight-reducing
agents. Use Type I error rate .05.
b. Construct interval estimates for all pairwise comparisons using the
Tukey Method with family-wise confidence coefficient .95.
-------------------------------------------------------------
8. (former in-class Quiz question) "Don't Sweat."
An experimenter sought to determine the effects of different levels of
anxiety on test scores. Thirty subjects were randomly assigned to one of
three conditions (ten to each): (1) low anxiety, (2) moderate anxiety,
and (3) high anxiety. A person's score was the number of items answered
correctly on the test. These scores are given below (C1-C3 in the Minitab
output; each group's ten scores occupy the two columns under its
heading):

       1           2           3
      Low       Moderate      High
    26  49      51  50      52  53
    34  74      50  48      64  77
    46  61      33  60      39  56
    48  51      28  71      54  63
    42  53      47  42      58  59

From the (edited) Minitab output below:
a. Provide the entries in the ANOVA table overwritten by 'a', 'b', and
'c'.
b. Carry out a test of the omnibus null hypothesis for the equality of
treatment means using Type I error rate .10. Provide the details of the
test.
c. For the output from the Tukey multiple comparisons procedure, provide
the values written over by 'd' and 'e'.

MTB > describe c1-c3

           N     MEAN   MEDIAN   TRMEAN    STDEV
C1        10    48.40    48.50    48.00    13.33
C2        10    48.00    49.00    47.62    12.26
C3        10    57.50    57.00    57.37     9.79

MTB > stack c1 c2 c3 c10;
SUBC> subscripts c11.
MTB > oneway c10 c11;
SUBC> tukey .10.

ANALYSIS OF VARIANCE ON C10
SOURCE     DF        SS        MS
C11        aa      bbbb       ccc
ERROR      27      3813       141
TOTAL      29      4390

POOLED STDEV = 11.88

Tukey's pairwise comparisons

    Family error rate = 0.100
Individual error rate = 0.0413

Intervals for (column level mean) - (row level mean)

               1           2

    2     -10.99
           11.79

    3     -20.49      dddddd
            2.29      eeeeee

(Of course, if you like, you can recreate this entire output using the
data in the problem.)
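
For anyone recreating the Problem 8 output outside Minitab, the sums of
squares in the ANOVA table can be checked with a short script. The
following is only a minimal sketch in Python (standard library only), not
the Minitab procedure itself. It assumes the thirty scores split into
groups so that the group means match the DESCRIBE output above (48.40,
48.00, 57.50), and the function name one_way_ss is hypothetical, chosen
for illustration.

```python
from statistics import mean

# Scores keyed by anxiety level. ASSUMPTION: this grouping is the one whose
# group means reproduce the DESCRIBE output (48.40, 48.00, 57.50).
scores = {
    "low": [26, 49, 34, 74, 46, 61, 48, 51, 42, 53],
    "moderate": [51, 50, 50, 48, 33, 60, 28, 71, 47, 42],
    "high": [52, 53, 64, 77, 39, 56, 54, 63, 58, 59],
}


def one_way_ss(groups):
    """Return (ss_between, ss_within, ss_total) for a one-way layout."""
    all_values = [x for g in groups.values() for x in g]
    grand_mean = mean(all_values)
    # Between: group size times squared deviation of group mean from grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Within: squared deviations of each score from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    # Total: squared deviations of each score from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between, ss_within, ss_total


ss_between, ss_within, ss_total = one_way_ss(scores)
df_between = len(scores) - 1
df_within = sum(len(g) for g in scores.values()) - len(scores)

print(f"Between: SS={ss_between:.1f} df={df_between} MS={ss_between / df_between:.1f}")
print(f"Within:  SS={ss_within:.1f} df={df_within} MS={ss_within / df_within:.1f}")
print(f"Total:   SS={ss_total:.1f} df={df_between + df_within}")
```

The ERROR and TOTAL rows printed by the script can be compared with the
Minitab table as a check on the data entry before filling in the
overwritten entries.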