Some benchmarks for bodyfat levels

9/20/93 CNN reports US Army bodyfat limit for discharge
                     Women       Men
                       30%       20%

5/19/93 SJ Merc
Fitness Level
  Top athletes      11-19%     5-12%
  Good Fitness      20-25%    13-17%
  US average        26-30%    18-25%
  Unfit              >=31%     >=26%
-----------------------------------------
MTB > read 'bodyfat.dat' c1-c4
     20 ROWS READ

 ROW      C1      C2      C3      C4

   1    19.5    43.1    29.1    11.9
   2    24.7    49.8    28.2    22.8
   3    30.7    51.9    37.0    18.7
   4    29.8    54.3    31.1    20.1
  . . .

***** Table 8.1 NWK********************
MTB > corr c1-c4

            C1       C2       C3
C2       0.924
C3       0.458    0.085
C4       0.843    0.878    0.142

***************************NWK Table 8.2*********************
MTB > brief 1
MTB > regress c4 1 c1

The regression equation is
C4 = - 1.50 + 0.857 C1

Predictor       Coef       Stdev    t-ratio        p
Constant      -1.496       3.319      -0.45    0.658
C1            0.8572      0.1288       6.66    0.000

s = 2.820       R-sq = 71.1%     R-sq(adj) = 69.5%

Analysis of Variance

SOURCE        DF          SS          MS         F        p
Regression     1      352.27      352.27     44.30    0.000
Error         18      143.12        7.95
Total         19      495.39

MTB > regress c4 1 c2

The regression equation is
C4 = - 23.6 + 0.857 C2

Predictor       Coef       Stdev    t-ratio        p
Constant     -23.634       5.657      -4.18    0.001
C2            0.8565      0.1100       7.79    0.000

s = 2.510       R-sq = 77.1%     R-sq(adj) = 75.8%

Analysis of Variance

SOURCE        DF          SS          MS         F        p
Regression     1      381.97      381.97     60.62    0.000
Error         18      113.42        6.30
Total         19      495.39

MTB > regress c4 1 c3

The regression equation is
C4 = 14.7 + 0.199 C3

Predictor       Coef       Stdev    t-ratio        p
Constant      14.687       9.096       1.61    0.124
C3            0.1994      0.3266       0.61    0.549

s = 5.193       R-sq = 2.0%      R-sq(adj) = 0.0%

Analysis of Variance

SOURCE        DF          SS          MS         F        p
Regression     1       10.05       10.05      0.37    0.549
Error         18      485.34       26.96
Total         19      495.39

MTB > regress c4 2 c1 c2

The regression equation is
C4 = - 19.2 + 0.222 C1 + 0.659 C2

Predictor       Coef       Stdev    t-ratio        p
Constant     -19.174       8.361      -2.29    0.035
C1            0.2224      0.3034       0.73    0.474
C2            0.6594      0.2912       2.26    0.037

s = 2.543       R-sq = 77.8%     R-sq(adj) = 75.2%

Analysis of Variance

SOURCE        DF          SS          MS         F        p
Regression     2      385.44      192.72     29.80    0.000
Error         17      109.95        6.47
Total         19      495.39

MTB > brief 3
MTB > regress c4 3 c1 c2 c3 c5 c6;
SUBC> residuals c7.

* NOTE *      C1 is highly correlated with other predictor variables
* NOTE *      C2 is highly correlated with other predictor variables
* NOTE *      C3 is highly correlated with other predictor variables

The regression equation is
C4 = 117 + 4.33 C1 - 2.86 C2 - 2.19 C3

Predictor       Coef       Stdev    t-ratio        p
Constant      117.08       99.78       1.17    0.258
C1             4.334       3.016       1.44    0.170
C2            -2.857       2.582      -1.11    0.285
C3            -2.186       1.595      -1.37    0.190

s = 2.480       R-sq = 80.1%     R-sq(adj) = 76.4%

Analysis of Variance

SOURCE        DF          SS          MS         F        p
Regression     3      396.98      132.33     21.52    0.000
Error         16       98.40        6.15
Total         19      495.39

SOURCE        DF      SEQ SS
C1             1      352.27
C2             1       33.17
C3             1       11.55

Obs.      C1        C4       Fit  Stdev.Fit  Residual  St.Resid
   1    19.5    11.900    14.855      1.449    -2.955     -1.47
   2    24.7    22.800    20.219      0.981     2.581      1.13
   3    30.7    18.700    20.987      1.646    -2.287     -1.23
   4    29.8    20.100    23.127      0.832    -3.027     -1.30
   5    19.1    12.900    11.758      1.490     1.142      0.58
   6    25.6    21.700    22.244      0.899    -0.544     -0.24
   7    31.4    27.100    25.714      1.093     1.386      0.62
   8    27.9    25.400    22.271      1.005     3.129      1.38
   9    22.1    21.300    19.595      1.089     1.705      0.77
  10    25.5    19.300    20.548      1.216    -1.248     -0.58
  11    31.1    25.400    24.596      0.926     0.804      0.35
  12    30.4    27.200    24.992      0.820     2.208      0.94
  13    18.7    11.700    15.009      1.146    -3.309     -1.50
  14    19.7    17.800    13.672      1.076     4.128      1.85
  15    14.6    12.800    11.812      1.464     0.988      0.49
  16    29.5    23.900    23.727      0.839     0.173      0.07
  17    27.7    22.600    22.974      0.878    -0.374     -0.16
  18    30.2    25.400    26.786      1.185    -1.386     -0.64
  19    22.7    14.800    18.526      0.902    -3.726     -1.61
  20    25.2    21.100    20.488      0.637     0.612      0.26

***************MT ver 7, 7-1 - 7-13 ******************************
MTB > help regress

REGRESS C on K predictors C,...,C [put stand. residuals &
        into C [put fits into C]]
(Stat > Regression > Regression)

Subcommands:  NOCONSTANT   RMATRIX      COOKD        DW
              WEIGHTS      HI           DFITS        PURE
              MSE          RESIDS       PREDICT      XLOF
              COEF         TRESIDS      VIF          TOLERANCE
              XPXINV

Fits the regression equation y = b0 + b1*X1 + b2*X2 + ... + bk*Xk to
data in selected response and predictor variables.  The response
variable is stored in the first column.  The number of predictors is
given next followed by columns containing the predictors.  If you give
an additional column, the standardized residuals will be stored.  If
you give a second column, the fits will also be stored.

The standardized residuals are ei/stdev(ei), where ei is the residual
and
        stdev(ei) = SQRT (MSE - Var(Yhati)).

The fitted value for i-th observation is:
        Yhati = b0 + b1X1 + ... + bkXk.

To control the amount of printed output, use the command BRIEF.

Missing Data

All observations which contain one or more missing values (either in
the dependent or one or more of the independent variables) are not
used in any of the regression calculations, with two exceptions:  If Yi
is missing but all predictors are present (or if case i has a
weight = 0 and all predictors are present), then Yhati is calculated,
and hi is calculated as xi*(INV(X'X))*xi', where xi is the row vector
of predictors for the i-th observation and X is the design matrix with
the i-th observation deleted.

Ill-conditioned data

See the Minitab Reference Manual for a discussion on the handling of
ill-conditioned data.  Ill-conditioned data refers to cases where some
predictors are highly correlated with other predictors, or when a
predictor variable has a small coefficient of variation.

The computational method used for regression is Givens transformations
using Linpack routines.  The method is described in Chapter 10 of the
Linpack User's Guide.  See the Minitab Reference Manual for reference
listings.
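
The standardized-residual definition in the help text can be checked by hand against the three-predictor output. The following is a minimal Python sketch, not part of the Minitab session; it assumes that Var(Yhati) is the square of the printed Stdev.Fit and that MSE is s squared (s = 2.480). The function name is made up for illustration.

```python
import math

def standardized_residual(residual, stdev_fit, mse):
    """Return e_i / sqrt(MSE - Var(Yhat_i)), taking Var(Yhat_i) = stdev_fit**2."""
    return residual / math.sqrt(mse - stdev_fit ** 2)

# Values copied from the three-predictor regression output (s = 2.480).
mse = 2.480 ** 2

# Obs. 1: Residual = -2.955, Stdev.Fit = 1.449
obs1 = standardized_residual(-2.955, 1.449, mse)
# Obs. 14: Residual = 4.128, Stdev.Fit = 1.076
obs14 = standardized_residual(4.128, 1.076, mse)

# These should agree with the St.Resid column to two decimals.
print(round(obs1, 2), round(obs14, 2))
```

Because the inputs are the rounded values printed by Minitab, agreement with the stored column C5 is only expected to about three decimal places.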
MTB > print c5 c7

 ROW         C5         C7

   1   -1.46803   -2.95499
   2    1.13327    2.58116
   3   -1.23262   -2.28668
   4   -1.29571   -3.02732
   5    0.57630    1.14239
   6   -0.23526   -0.54371
   7    0.62250    1.38568
   8    1.38023    3.12936
   9    0.76530    1.70519
  10   -0.57762   -1.24839
  11    0.34965    0.80444
  12    0.94324    2.20770
  13   -1.50478   -3.30940
  14    1.84716    4.12769
  15    0.49353    0.98805
  16    0.07393    0.17254
  17   -0.16108   -0.37361
  18   -0.63615   -1.38591
  19   -1.61308   -3.72628
  20    0.25538    0.61209

**********************NWK Table 11.2 ********************
MTB > note deleted residuals check
MTB > regress c4 2 c1 c2 c11;
SUBC> residuals c12;
SUBC> tresiduals c13;
SUBC> hi c14.

The regression equation is
C4 = - 19.2 + 0.222 C1 + 0.659 C2

Predictor       Coef       Stdev    t-ratio        p
Constant     -19.174       8.361      -2.29    0.035
C1            0.2224      0.3034       0.73    0.474
C2            0.6594      0.2912       2.26    0.037

s = 2.543       R-sq = 77.8%     R-sq(adj) = 75.2%

Analysis of Variance

SOURCE        DF          SS          MS         F        p
Regression     2      385.44      192.72     29.80    0.000
Error         17      109.95        6.47
Total         19      495.39

SOURCE        DF      SEQ SS
C1             1      352.27
C2             1       33.17

Obs.      C1        C4       Fit  Stdev.Fit  Residual  St.Resid
   1    19.5    11.900    13.583      1.140    -1.683     -0.74
   2    24.7    22.800    19.157      0.617     3.643      1.48
   3    30.7    18.700    21.876      1.551    -3.176     -1.58
   4    29.8    20.100    23.258      0.847    -3.158     -1.32
   5    19.1    12.900    12.900      1.267    -0.000     -0.00
   6    25.6    21.700    22.061      0.912    -0.361     -0.15
   7    31.4    27.100    26.384      1.003     0.716      0.31
   8    27.9    25.400    21.385      0.789     4.015      1.66
   9    22.1    21.300    18.645      0.861     2.655      1.11
  10    25.5    19.300    21.775      0.844    -2.475     -1.03
  11    31.1    25.400    25.064      0.882     0.336      0.14
  12    30.4    27.200    24.974      0.841     2.226      0.93
  13    18.7    11.700    15.647      1.074    -3.947     -1.71
  14    19.7    17.800    14.353      0.978     3.447      1.47
  15    14.6    12.800    12.229      1.468     0.571      0.27
  16    29.5    23.900    23.258      0.785     0.642      0.27
  17    27.7    22.600    23.451      0.826    -0.851     -0.35
  18    30.2    25.400    26.183      1.128    -0.783     -0.34
  19    22.7    14.800    17.657      0.658    -2.857     -1.16
  20    25.2    21.100    20.060      0.569     1.040      0.42

MTB > print c10-c14

 ROW       C10        C11        C12        C13        C14

   1             -0.74023   -1.68271   -0.72998   0.201013
   2              1.47658    3.64293    1.53425   0.058895
   3             -1.57579   -3.17597   -1.65433   0.371933
   4             -1.31715   -3.15846   -1.34848   0.110940
   5             -0.00013   -0.00029   -0.00013   0.248010
   6             -0.15199   -0.36082   -0.14755   0.128616
   7              0.30645    0.71620    0.29813   0.155517
   8              1.66061    4.01473    1.76009   0.096288
   9              1.10955    2.65510    1.11765   0.114636
  10             -1.03165   -2.47481   -1.03373   0.110244
  11              0.14078    0.33581    0.13666   0.120337
  12              0.92722    2.22551    0.92318   0.109266
  13             -1.71215   -3.94686   -1.82590   0.178382
  14              1.46861    3.44746    1.52476   0.148007
  15              0.27476    0.57059    0.26715   0.333212
  16              0.26552    0.64230    0.25813   0.095277
  17             -0.35380   -0.85095   -0.34451   0.105595
  18             -0.34350   -0.78292   -0.33441   0.196793
  19             -1.16313   -2.85729   -1.17617   0.066954
  20              0.41976    1.04045    0.40936   0.050085
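
The deleted-residuals check can also be reproduced outside Minitab from the stored residuals and leverages. The sketch below is an illustration in Python, not part of the session. It assumes the usual textbook identity for the studentized deleted residual, t_i = e_i * SQRT((n - p - 1) / (SSE*(1 - h_i) - e_i^2)), with p counting the constant (p = 3 here); the transcript does not state which formula TRESIDUALS uses. The function names are hypothetical, and the inputs are the printed values for rows 1, 8 and 13.

```python
import math

def standardized_residual(e, h, mse):
    """e_i / sqrt(MSE * (1 - h_i)): the internally studentized residual."""
    return e / math.sqrt(mse * (1.0 - h))

def deleted_t_residual(e, h, sse, n, p):
    """Studentized deleted residual computed without refitting the model.

    Assumed textbook identity:
        t_i = e_i * sqrt((n - p - 1) / (SSE * (1 - h_i) - e_i**2))
    """
    return e * math.sqrt((n - p - 1) / (sse * (1.0 - h) - e * e))

# From the two-predictor ANOVA table: n = 20 cases, Error SS = 109.95.
n, p, sse = 20, 3, 109.95
mse = sse / (n - p)

# (residual, leverage) pairs copied from the printed residual and hi columns.
rows = {
    1: (-1.68271, 0.201013),
    8: (4.01473, 0.096288),
    13: (-3.94686, 0.178382),
}

for obs, (e, h) in rows.items():
    std = standardized_residual(e, h, mse)
    tres = deleted_t_residual(e, h, sse, n, p)
    print(obs, round(std, 3), round(tres, 3))
```

Under these assumptions the two computed columns should match the printed standardized and t residuals to about three decimals; small differences come from using the rounded Error SS.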