--------------------------------------------------------------------------------------------------
name: <unnamed>
log: C:\Documents and Settings\Michael Rosenfeld\My Documents\newer web pages\soc_meth_proj3\fall_2011_381_logs\class8.log
log type: text
opened on: 20 Oct 2011, 13:23:48
. use "C:\Documents and Settings\Michael Rosenfeld\My Documents\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta", clear
* Take a look at the Excel worksheet pages on regression additivity and regression linearity.
. table yrsed sex if age>=30 & age<=39 & incwage>0, contents(freq mean incwage)
------------------------------------
based on | Sex
educrec | Male Female
----------+-------------------------
0 | 21 12
| 17776.19048 14534.33333
|
2.5 | 94 39
| 18555.43617 10650.89744
|
6.5 | 326 204
| 20013.61963 12691.59314
|
9 | 173 101
| 18950.9422 10504.71287
|
10 | 180 151
| 22419.21111 11830.55629
|
11 | 235 178
| 22384.86383 13230.02247
|
12 | 3,082 2,510
| 31565.6486 18713.77171
|
14 | 2,269 2,380
| 37670.1353 22863.12983
|
17 | 2,506 2,281
| 59410.84158 37053.81149
------------------------------------
. table yrsed sex if age>=30 & age<=39 & incwage>0, contents(freq mean incwage p25 incwage median incwage p75 incwage)
------------------------------------
based on | Sex
educrec | Male Female
----------+-------------------------
0 | 21 12
| 17776.19048 14534.33333
| 12480 7500
| 15000 12340
| 22000 14500
|
2.5 | 94 39
| 18555.43617 10650.89744
| 12000 7600
| 16000 10000
| 23000 14820
|
6.5 | 326 204
| 20013.61963 12691.59314
| 12000 6450
| 16950 11930
| 25000 15220
|
9 | 173 101
| 18950.9422 10504.71287
| 12000 4300
| 17000 10000
| 24000 14000
|
10 | 180 151
| 22419.21111 11830.55629
| 12740 5600
| 20000 10404
| 29000 16000
|
11 | 235 178
| 22384.86383 13230.02247
| 13000 6000
| 20000 11880
| 30000 18000
|
12 | 3,082 2,510
| 31565.6486 18713.77171
| 18200 10000
| 28321 16900
| 40000 25000
|
14 | 2,269 2,380
| 37670.1353 22863.12983
| 23777 11000
| 34000 20000
| 46000 30000
|
17 | 2,506 2,281
| 59410.84158 37053.81149
| 32300 20000
| 50000 32875
| 70000 47500
------------------------------------
. regress incwage yrsed male if age>=30 & age<=39 & incwage>0
Source | SS df MS Number of obs = 16742
-------------+------------------------------ F( 2, 16739) = 1895.13
Model | 2.6827e+12 2 1.3413e+12 Prob > F = 0.0000
Residual | 1.1848e+13 16739 707778375 R-squared = 0.1846
-------------+------------------------------ Adj R-squared = 0.1845
Total | 1.4530e+13 16741 867939018 Root MSE = 26604
------------------------------------------------------------------------------
incwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
yrsed | 3636.148 73.18493 49.68 0.000 3492.698 3779.598
male | 16014.47 412.5284 38.82 0.000 15205.88 16823.07
_cons | -25264.87 1050.082 -24.06 0.000 -27323.14 -23206.6
------------------------------------------------------------------------------
. *we will call this Model 1
. predict linear_M1
(option xb assumed; fitted values)
(30484 missing values generated)
. table yrsed sex if age>=30 & age<=39 & incwage>0, contents(freq mean incwage mean linear_M1)
------------------------------------
based on | Sex
educrec | Male Female
----------+-------------------------
0 | 21 12
| 17776.19048 14534.33333
| -9250.396 -25264.87
|
2.5 | 94 39
| 18555.43617 10650.89744
| -160.0259 -16174.5
|
6.5 | 326 204
| 20013.61963 12691.59314
| 14384.57 -1629.909
|
9 | 173 101
| 18950.9422 10504.71287
| 23474.94 7460.46
|
10 | 180 151
| 22419.21111 11830.55629
| 27111.08 11096.61
|
11 | 235 178
| 22384.86383 13230.02247
| 30747.23 14732.76
|
12 | 3,082 2,510
| 31565.6486 18713.77171
| 34383.38 18368.9
|
14 | 2,269 2,380
| 37670.1353 22863.12983
| 41655.68 25641.2
|
17 | 2,506 2,281
| 59410.84158 37053.81149
| 52564.12 36549.64
------------------------------------
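* A quick check of where the Model 1 fitted values in the table above come from, as a minimal sketch: the coefficients are typed in by hand from the Model 1 output, and the scalar names (m1_cons, yhat_m12, and so on) are made up for this illustration. It also shows two consequences of the linear, additive specification: the predicted male-female gap is the same at every education level even though the observed gap is not, and the prediction for women with 0 years of schooling is negative.

```stata
* Coefficients copied by hand from the Model 1 regression output
scalar m1_cons  = -25264.87
scalar m1_yrsed = 3636.148
scalar m1_male  = 16014.47

* Fitted value = constant + slope * years of education (+ male shift)
scalar yhat_f0  = m1_cons
scalar yhat_f12 = m1_cons + m1_yrsed*12
scalar yhat_m12 = m1_cons + m1_yrsed*12 + m1_male
scalar yhat_f17 = m1_cons + m1_yrsed*17
scalar yhat_m17 = m1_cons + m1_yrsed*17 + m1_male

display "Women,  0 years: " %10.2f yhat_f0
display "Women, 12 years: " %10.2f yhat_f12
display "Men,   12 years: " %10.2f yhat_m12
display "Women, 17 years: " %10.2f yhat_f17
display "Men,   17 years: " %10.2f yhat_m17

* Predicted male-female gap: identical at every education level
scalar gap_hat_12 = yhat_m12 - yhat_f12
scalar gap_hat_17 = yhat_m17 - yhat_f17

* Observed gaps, from the cell means of incwage in the table
scalar gap_obs_12 = 31565.6486  - 18713.77171
scalar gap_obs_17 = 59410.84158 - 37053.81149

display "Gap at 12 years, predicted and observed: " %10.2f gap_hat_12 "  " %10.2f gap_obs_12
display "Gap at 17 years, predicted and observed: " %10.2f gap_hat_17 "  " %10.2f gap_obs_17
```

* The displayed fitted values should agree with the mean linear_M1 rows of the table up to rounding in the printed coefficients.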
. gen HS=0
. replace HS=1 if yrsed==12
(33461 real changes made)
. gen byte Assoc=0
. replace Assoc=1 if yrsed==14
(25883 real changes made)
. gen byte BA_plus=0
. replace BA_plus=1 if yrsed==17
(21814 real changes made)
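* A side note on the three indicators just created, as a hedged sketch on a five-row toy dataset. The toy values are invented for illustration, and the clear command wipes the data in memory, so try this in a separate Stata session. Starting from gen HS=0 assigns 0 to anyone whose yrsed is missing, which would explain why predict later fills in a Model 2 fitted value for every observation while the Model 1 predict left 30484 missing. Both regressions in this log use the same 16742 cases, so the estimates are unaffected, but the version below keeps missing education missing.

```stata
* Toy data: four education codes plus one missing value (invented)
clear
input yrsed
0
12
14
17
.
end

* (yrsed == 12) evaluates to 1 or 0; the if qualifier leaves the
* indicator missing wherever yrsed itself is missing
gen byte HS      = (yrsed == 12) if !missing(yrsed)
gen byte Assoc   = (yrsed == 14) if !missing(yrsed)
gen byte BA_plus = (yrsed == 17) if !missing(yrsed)

list, noobs
```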
. regress incwage HS Assoc BA_plus male if age>=30 & age<=39 & incwage>0
Source | SS df MS Number of obs = 16742
-------------+------------------------------ F( 4, 16737) = 1071.59
Model | 2.9625e+12 4 7.4062e+11 Prob > F = 0.0000
Residual | 1.1568e+13 16737 691144260 R-squared = 0.2039
-------------+------------------------------ Adj R-squared = 0.2037
Total | 1.4530e+13 16741 867939018 Root MSE = 26290
------------------------------------------------------------------------------
incwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
HS | 9300.967 726.1082 12.81 0.000 7877.718 10724.22
Assoc | 14583.65 744.3114 19.59 0.000 13124.72 16042.58
BA_plus | 32695.45 740.6737 44.14 0.000 31243.65 34147.25
male | 15691.06 408.079 38.45 0.000 14891.18 16490.94
_cons | 7848.007 680.6283 11.53 0.000 6513.903 9182.11
------------------------------------------------------------------------------
. predict M2_nonlinear
(option xb assumed; fitted values)
. table yrsed sex if age>=30 & age<=39 & incwage>0, contents(freq mean incwage mean M2_nonlinear)
------------------------------------
based on | Sex
educrec | Male Female
----------+-------------------------
0 | 21 12
| 17776.19048 14534.33333
| 23539.06 7848.006
|
2.5 | 94 39
| 18555.43617 10650.89744
| 23539.06 7848.006
|
6.5 | 326 204
| 20013.61963 12691.59314
| 23539.06 7848.006
|
9 | 173 101
| 18950.9422 10504.71287
| 23539.06 7848.006
|
10 | 180 151
| 22419.21111 11830.55629
| 23539.06 7848.006
|
11 | 235 178
| 22384.86383 13230.02247
| 23539.06 7848.006
|
12 | 3,082 2,510
| 31565.6486 18713.77171
| 32840.03 17148.97
|
14 | 2,269 2,380
| 37670.1353 22863.12983
| 38122.71 22431.66
|
17 | 2,506 2,281
| 59410.84158 37053.81149
| 56234.51 40543.46
------------------------------------
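* The Model 2 fitted values in the table above can be rebuilt from the coefficients, and the dummy coefficients can be turned into an implied dollar gain per year of schooling for comparison with the single Model 1 slope of 3636.148. This is a rough sketch: the numbers are copied by hand from the output, the scalar names are invented for the illustration, and the year spans (12 to 14, 14 to 17) come from the coded yrsed values rather than from actual years attended.

```stata
* Coefficients copied by hand from the Model 2 regression output
scalar m2_cons  = 7848.007
scalar m2_HS    = 9300.967
scalar m2_Assoc = 14583.65
scalar m2_BA    = 32695.45
scalar m2_male  = 15691.06

* Everyone below 12 years shares the excluded category, so all of
* those cells get the same fitted value within each sex
scalar fit_f_lt12 = m2_cons
scalar fit_m_lt12 = m2_cons + m2_male
scalar fit_m_HS   = m2_cons + m2_HS + m2_male
scalar fit_f_BA   = m2_cons + m2_BA

display "Women, under 12 years: " %10.2f fit_f_lt12
display "Men,   under 12 years: " %10.2f fit_m_lt12
display "Men,   12 years:       " %10.2f fit_m_HS
display "Women, 17 years:       " %10.2f fit_f_BA

* Implied gain per coded year between adjacent categories
scalar per_yr_12_14 = (m2_Assoc - m2_HS)/2
scalar per_yr_14_17 = (m2_BA - m2_Assoc)/3

display "Per year, 12 to 14: " %10.2f per_yr_12_14
display "Per year, 14 to 17: " %10.2f per_yr_14_17
```

* If the two per-year figures differ a lot from each other and from 3636.148, that is the nonlinearity the single yrsed slope in Model 1 cannot capture.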
. table educrec, contents(mean yrsed)
-------------------------------------
Educational attainment  |
recode                  | mean(yrsed)
------------------------+------------
                    NIU |
      None or preschool |           0
   Grades 1, 2, 3, or 4 |         2.5
   Grades 5, 6, 7, or 8 |         6.5
                Grade 9 |           9
               Grade 10 |          10
               Grade 11 |          11
               Grade 12 |          12
1 to 3 years of college |          14
    4+ years of college |          17
-------------------------------------
* And now a brief look at what changes and what doesn't change in regression when we change the inputs.
. regress incwage yrsed male if age>=30 & age<=39 & incwage>0
Source | SS df MS Number of obs = 16742
-------------+------------------------------ F( 2, 16739) = 1895.13
Model | 2.6827e+12 2 1.3413e+12 Prob > F = 0.0000
Residual | 1.1848e+13 16739 707778375 R-squared = 0.1846
-------------+------------------------------ Adj R-squared = 0.1845
Total | 1.4530e+13 16741 867939018 Root MSE = 26604
------------------------------------------------------------------------------
incwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
yrsed | 3636.148 73.18493 49.68 0.000 3492.698 3779.598
male | 16014.47 412.5284 38.82 0.000 15205.88 16823.07
_cons | -25264.87 1050.082 -24.06 0.000 -27323.14 -23206.6
------------------------------------------------------------------------------
. regress incwage yrsed female if age>=30 & age<=39 & incwage>0
Source | SS df MS Number of obs = 16742
-------------+------------------------------ F( 2, 16739) = 1895.13
Model | 2.6827e+12 2 1.3413e+12 Prob > F = 0.0000
Residual | 1.1848e+13 16739 707778375 R-squared = 0.1846
-------------+------------------------------ Adj R-squared = 0.1845
Total | 1.4530e+13 16741 867939018 Root MSE = 26604
------------------------------------------------------------------------------
incwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
yrsed | 3636.148 73.18493 49.68 0.000 3492.698 3779.598
female | -16014.47 412.5284 -38.82 0.000 -16823.07 -15205.88
_cons | -9250.395 1025.037 -9.02 0.000 -11259.58 -7241.214
------------------------------------------------------------------------------
* Changing the excluded category of gender from female to male reverses the sign of the gender coefficient. Its SE is the same, so the t-statistic also changes sign, but it still means the same thing. The yrsed coefficient, SE, and t-statistic are unchanged, as are the R-squared, the F-statistic, and the Root MSE. The constant has changed because it is now the predicted value for men with yrsed=0 rather than for women with yrsed=0. The model is exactly the same in substance, but different in appearance.
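* The link between the two parameterizations is simple arithmetic, sketched here with coefficients typed in from the two outputs above (the scalar names are invented for the illustration). If female is coded as 1 - male, which the log does not show but the output is consistent with, the new constant should equal the old constant plus the male coefficient, and any fitted value should come out the same from either set of coefficients, up to rounding in the printed digits.

```stata
* Regression with male (women are the excluded category)
scalar a_cons  = -25264.87
scalar a_yrsed = 3636.148
scalar a_male  = 16014.47

* Regression with female (men are the excluded category)
scalar b_cons   = -9250.395
scalar b_yrsed  = 3636.148
scalar b_female = -16014.47

* New constant = old constant + male coefficient
scalar cons_check = a_cons + a_male
display "Old constant + male coefficient: " %10.3f cons_check
display "Constant in the female version:  " %10.3f b_cons

* Fitted value for a woman with 14 years, computed both ways
scalar fit_a = a_cons + a_yrsed*14 + a_male*0
scalar fit_b = b_cons + b_yrsed*14 + b_female*1
display "Woman, 14 years (male version):   " %10.3f fit_a
display "Woman, 14 years (female version): " %10.3f fit_b
```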
. regress incwage yrsed female i.metro if age>=30 & age<=39 & incwage>0
Source | SS df MS Number of obs = 16742
-------------+------------------------------ F( 6, 16735) = 685.40
Model | 2.8662e+12 6 4.7771e+11 Prob > F = 0.0000
Residual | 1.1664e+13 16735 696977740 R-squared = 0.1973
-------------+------------------------------ Adj R-squared = 0.1970
Total | 1.4530e+13 16741 867939018 Root MSE = 26400
------------------------------------------------------------------------------
incwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
yrsed | 3554.301 72.86128 48.78 0.000 3411.485 3697.117
female | -15890.47 409.5059 -38.80 0.000 -16693.15 -15087.8
|
metro |
1 | -481.5509 3800.688 -0.13 0.899 -7931.301 6968.199
2 | 4539.458 3793.82 1.20 0.232 -2896.83 11975.75
3 | 8288.016 3785.255 2.19 0.029 868.5163 15707.52
4 | 2866.183 3811 0.75 0.452 -4603.78 10336.15
|
_cons | -13037.94 3914.939 -3.33 0.001 -20711.63 -5364.245
------------------------------------------------------------------------------
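* When you add a new variable, all of the other coefficients and standard errors can change (here the yrsed coefficient goes from 3636.148 to 3554.301). N stays the same, because there appear to be no missing values for metro in this subsample, and R-squared goes up a little, from 0.1846 to 0.1973.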
. regress incwage yrsed female ib4.metro if age>=30 & age<=39 & incwage>0
Source | SS df MS Number of obs = 16742
-------------+------------------------------ F( 6, 16735) = 685.40
Model | 2.8662e+12 6 4.7771e+11 Prob > F = 0.0000
Residual | 1.1664e+13 16735 696977740 R-squared = 0.1973
-------------+------------------------------ Adj R-squared = 0.1970
Total | 1.4530e+13 16741 867939018 Root MSE = 26400
------------------------------------------------------------------------------
incwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
yrsed | 3554.301 72.86128 48.78 0.000 3411.485 3697.117
female | -15890.47 409.5059 -38.80 0.000 -16693.15 -15087.8
|
metro |
0 | -2866.183 3811 -0.75 0.452 -10336.15 4603.78
1 | -3347.734 714.7852 -4.68 0.000 -4748.788 -1946.679
2 | 1673.275 679.1819 2.46 0.014 342.007 3004.544
3 | 5421.833 631.1918 8.59 0.000 4184.631 6659.036
|
_cons | -10171.76 1146.288 -8.87 0.000 -12418.6 -7924.91
------------------------------------------------------------------------------
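* Changing the excluded category of metro from 0 to 4 gives very different-looking metro coefficients, standard errors, and t-statistics, because each one is now a comparison with category 4 rather than with category 0. The constant changes as well. It is still the same model: the yrsed and female coefficients, the R-squared, and the F-statistic are unchanged.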
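* The same bookkeeping as for gender applies to metro. As a sketch with numbers typed in from the i.metro output above (the scalar names are invented for the illustration): subtracting the old category 4 coefficient from each old metro coefficient should reproduce the ib4.metro coefficients, the newly included category 0 gets minus that coefficient, and the constant absorbs the difference.

```stata
* Coefficients from the i.metro run (category 0 excluded)
scalar old_m1   = -481.5509
scalar old_m2   = 4539.458
scalar old_m3   = 8288.016
scalar old_m4   = 2866.183
scalar old_cons = -13037.94

* Re-express every category relative to category 4
scalar new_m0   = 0 - old_m4
scalar new_m1   = old_m1 - old_m4
scalar new_m2   = old_m2 - old_m4
scalar new_m3   = old_m3 - old_m4
scalar new_cons = old_cons + old_m4

display "metro 0: " %10.3f new_m0
display "metro 1: " %10.3f new_m1
display "metro 2: " %10.3f new_m2
display "metro 3: " %10.3f new_m3
display "_cons:   " %10.3f new_cons
```

* The standard errors cannot be recovered this way, since they also depend on the covariances between the coefficients.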
. log close
name: <unnamed>
log: C:\Documents and Settings\Michael Rosenfeld\My Documents\newer web pages\soc_meth_proj3\fall_2011_381_logs\class8.log
log type: text
closed on: 20 Oct 2011, 15:31:08
--------------------------------------------------------------------------------