* First I pasted the dataset from Excel into Stata's data editor.

. *(8 variables, 11 observations pasted into data editor)

. save "C:\Documents and Settings\Michael Rosenfeld\My Documents\newer web pages\soc_meth_proj3\fall_2010_s381_logs\anscombe.dta"

file C:\Documents and Settings\Michael Rosenfeld\My Documents\newer web pages\soc_meth_proj3\fall_2010_s381_logs\anscombe.dta saved

* Then I saved it.

. twoway (scatter y2 x2) (lfit y2 x2)

* Then I plotted y2 against x2 with the best fit line (above), and regressed y2 on x2 (below).

. regress y2 x2

      Source |       SS       df       MS              Number of obs =      11
-------------+------------------------------           F(  1,     9) =   17.97
       Model |  27.5000024     1  27.5000024           Prob > F      =  0.0022
    Residual |   13.776294     9  1.53069933           R-squared     =  0.6662
       Total |  41.2762964    10  4.12762964           Root MSE      =  1.2372

------------------------------------------------------------------------------
          y2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |         .5   .1179638     4.24   0.002     .2331475    .7668526
       _cons |   3.000909   1.125303     2.67   0.026     .4552978     5.54652
------------------------------------------------------------------------------

. predict m2

(option xb assumed; fitted values)

* create predicted values

. gen resid_m2= y2- m2

* generate residuals
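
* A minimal do-file sketch of the same steps, for anyone without the saved .dta file. It assumes x2 and y2 are the second Anscombe dataset and types them in with -input- rather than pasting from Excel. The variable resid_alt is a hypothetical name, used here only to check the hand-made residuals against the ones -predict- can produce directly with its residuals option.

```stata
* Assumption: these 11 pairs are Anscombe's second dataset (x2, y2)
clear
input x2 y2
10 9.14
 8 8.14
13 8.74
 9 8.77
11 9.26
14 8.10
 6 6.13
 4 3.10
12 9.13
 7 7.26
 5 4.74
end

* Fit the same simple regression as in the log
regress y2 x2

* Fitted values (xb is the default), then residuals by hand
predict m2
gen resid_m2 = y2 - m2

* Residuals straight from predict; resid_alt is a made-up name
predict resid_alt, residuals

* The two versions should agree up to floating-point rounding
assert abs(resid_m2 - resid_alt) < 1e-5
```

* Either route gives the same residuals; -predict, residuals- just saves the subtraction step.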

. twoway (scatter  resid_m2 x2)

* plot the residuals against x

. rvfplot, yline(0)

* Plot the residuals against the fitted values (a residual-versus-fitted plot). The residuals show a strikingly nonrandom pattern, which we could easily have seen in the first graph of the actual data and the best fit line.
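
* One possible follow-up, not part of the original session: if the nonrandom pattern is curvature, a squared term should absorb it. The sketch below again assumes x2 and y2 are the second Anscombe dataset, entered by hand; x2sq is a hypothetical variable name.

```stata
* Assumption: these 11 pairs are Anscombe's second dataset (x2, y2)
clear
input x2 y2
10 9.14
 8 8.14
13 8.74
 9 8.77
11 9.26
14 8.10
 6 6.13
 4 3.10
12 9.13
 7 7.26
 5 4.74
end

* Add a squared term so the model can follow a curved relationship
gen x2sq = x2^2

* Refit with both the linear and the squared term
regress y2 x2 x2sq

* Redraw the residual-versus-fitted plot for the new model
rvfplot, yline(0)
```

* If the curvature explanation is right, the second rvfplot should look far less patterned than the first one.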

. exit, clear