-----------------------------------------------------------------------------------
name: <unnamed>
log: C:\Documents and Settings\Michael Rosenfeld\My Documents\newer web pages\soc_meth_proj3\2010_logs\section_four.log
log type: text
opened on:
. use "C:\Documents and Settings\Michael Rosenfeld\Desktop\cps_mar_2000_new.dta", clear
. regress incwage female
Source | SS df MS Number of obs = 103226
-------------+------------------------------ F( 1,103224) = 5006.28
Model | 3.9723e+12 1 3.9723e+12 Prob > F = 0.0000
Residual | 8.1905e+13 103224 793465967 R-squared = 0.0463
-------------+------------------------------ Adj R-squared = 0.0462
Total | 8.5877e+13 103225 831940347 Root MSE = 28169
------------------------------------------------------------------------------
incwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | -12418.63 175.5159 -70.76 0.000 -12762.64 -12074.63
_cons | 25943.8 126.7965 204.61 0.000 25695.28 26192.32
------------------------------------------------------------------------------
. regress incwage female Korean_vet
Source | SS df MS Number of obs = 92865
-------------+------------------------------ F( 2, 92862) = 2711.18
Model | 4.0347e+12 2 2.0173e+12 Prob > F = 0.0000
Residual | 6.9097e+13 92862 744082727 R-squared = 0.0552
-------------+------------------------------ Adj R-squared = 0.0551
Total | 7.3132e+13 92864 787514009 Root MSE = 27278
------------------------------------------------------------------------------
incwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | -13172.95 183.1608 -71.92 0.000 -13531.94 -12813.96
Korean_vet | -17929.45 672.63 -26.66 0.000 -19247.8 -16611.11
_cons | 26628.72 140.0058 190.20 0.000 26354.31 26903.13
------------------------------------------------------------------------------
. regress incwage female Korean_vet [aweight= perwt_rounded]
(sum of wgt is 1.9224e+08)
Source | SS df MS Number of obs = 92865
-------------+------------------------------ F( 2, 92862) = 2705.82
Model | 4.2126e+12 2 2.1063e+12 Prob > F = 0.0000
Residual | 7.2286e+13 92862 778428329 R-squared = 0.0551
-------------+------------------------------ Adj R-squared = 0.0550
Total | 7.6499e+13 92864 823774386 Root MSE = 27900
------------------------------------------------------------------------------
incwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | -13439.69 187.0091 -71.87 0.000 -13806.23 -13073.16
Korean_vet | -18314.59 695.6556 -26.33 0.000 -19678.07 -16951.11
_cons | 27284.84 142.3742 191.64 0.000 27005.79 27563.9
------------------------------------------------------------------------------
. *regress assumes that every predictor is continuous unless you mark it as categorical with the i. prefix (i.variable). desmat reverses that assumption: it treats every predictor as categorical unless you mark it as continuous with the @ prefix.
. display -13439/187
-71.86631
. *get used to reading the t statistic as the coefficient (beta) divided by its standard error; the ratio above matches the t of -71.87 reported for female.
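. *A sketch, not part of the original session: Stata keeps the coefficients and standard errors after regress, so the same ratio can be computed from the stored results instead of retyping rounded numbers. It assumes the weighted regression above is still the most recent estimation command.

```stata
* Assumes the weighted regression of incwage on female and Korean_vet
* (the regress command above) is still the active estimation result.
* t statistic for female: coefficient divided by its standard error
display _b[female] / _se[female]
* two-sided p-value from the t distribution with the residual degrees of freedom
display 2 * ttail(e(df_r), abs(_b[female] / _se[female]))
```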
. regress incwage female Korean_vet i.metro [aweight= perwt_rounded]
(sum of wgt is 1.9224e+08)
Source | SS df MS Number of obs = 92865
-------------+------------------------------ F( 6, 92858) = 1101.29
Model | 5.0820e+12 6 8.4700e+11 Prob > F = 0.0000
Residual | 7.1417e+13 92858 769098857 R-squared = 0.0664
-------------+------------------------------ Adj R-squared = 0.0664
Total | 7.6499e+13 92864 823774386 Root MSE = 27733
------------------------------------------------------------------------------
incwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | -13379.51 185.9099 -71.97 0.000 -13743.89 -13015.12
Korean_vet | -17933.67 691.8545 -25.92 0.000 -19289.69 -16577.64
|
metro |
1 | -8232.328 1892.845 -4.35 0.000 -11942.29 -4522.371
2 | -3490.151 1890.107 -1.85 0.065 -7194.742 214.4387
3 | -47.68832 1886.323 -0.03 0.980 -3744.861 3649.485
4 | -4783.277 1897.19 -2.52 0.012 -8501.75 -1064.805
|
_cons | 30322.75 1884.631 16.09 0.000 26628.89 34016.6
------------------------------------------------------------------------------
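. *Each metro row above is a contrast with the omitted category (metro==0), and the rows differ a lot in significance. One possible follow-up, which was not run in the original session, is a joint test of all four metro coefficients. This sketch assumes Stata 11 or later and that the i.metro regression above is still the active result.

```stata
* Assumes the regression with i.metro above was the last estimation command.
* Joint F test that the four metro coefficients are all zero.
testparm i.metro
```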
. *now let me do the same regression with desmat
. desmat: regress incwage female Korean_vet metro [aweight= perwt_rounded]
-----------------------------------------------------------------------------------
Linear regression
-----------------------------------------------------------------------------------
Dependent variable incwage
Number of observations: 92865
aweight: perwt_rounded
F statistic: 1101.289
Model degrees of freedom: 6
Residual degrees of freedom: 92858
R-squared: 0.066
Adjusted R-squared: 0.066
Root MSE 27732.632
Prob: 0.000
-----------------------------------------------------------------------------------
nr Effect Coeff s.e.
-----------------------------------------------------------------------------------
female
1 1 -13379.506** 185.910
Korean_vet
2 1 -17933.665** 691.854
metro
3 Not in metro area -8232.328** 1892.845
4 Central city -3490.151 1890.107
5 Outside central city -47.688 1886.323
6 Central city status unknown -4783.277* 1897.190
7 _cons 30322.747** 1884.631
-----------------------------------------------------------------------------------
* p < .05
** p < .01
. *note the "@" in front of age in the regression below; it tells desmat to treat age as continuous.
. desmat: regress incwage female Korean_vet metro @age [aweight= perwt_rounded]
-----------------------------------------------------------------------------------
Linear regression
-----------------------------------------------------------------------------------
Dependent variable incwage
Number of observations: 92865
aweight: perwt_rounded
F statistic: 946.636
Model degrees of freedom: 7
Residual degrees of freedom: 92857
R-squared: 0.067
Adjusted R-squared: 0.067
Root MSE 27730.163
Prob: 0.000
-----------------------------------------------------------------------------------
nr Effect Coeff s.e.
-----------------------------------------------------------------------------------
female
1 1 -13514.761** 188.678
Korean_vet
2 1 -18581.637** 708.887
metro
3 Not in metro area -8236.839** 1892.677
4 Central city -3442.369 1889.973
5 Outside central city -24.318 1886.163
6 Central city status unknown -4759.511* 1897.030
7 Age 21.869** 5.222
8 _cons 29469.076** 1895.457
-----------------------------------------------------------------------------------
* p < .05
** p < .01
. gen age_sq=age^2
. *this is how we generate the age-squared variable. Notice that the new variable is added at the bottom of your variable list.
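. *An alternative sketch, assuming Stata 11 or later: with plain regress, factor-variable notation can build the squared term on the fly, so no separate age_sq variable is needed. This was not run in the original session, and the squared term would be labeled c.age#c.age rather than age_sq in the output.

```stata
* Assumes the same data in memory and Stata 11+ factor-variable support.
* c.age##c.age expands to age plus age*age (both treated as continuous).
regress incwage female Korean_vet i.metro c.age##c.age [aweight = perwt_rounded]
```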
. desmat: regress incwage female Korean_vet metro @age @age_sq [aweight= perwt_rounded]
-----------------------------------------------------------------------------------
Linear regression
-----------------------------------------------------------------------------------
Dependent variable incwage
Number of observations: 92865
aweight: perwt_rounded
F statistic: 2535.406
Model degrees of freedom: 8
Residual degrees of freedom: 92856
R-squared: 0.179
Adjusted R-squared: 0.179
Root MSE 26002.864
Prob: 0.000
-----------------------------------------------------------------------------------
nr Effect Coeff s.e.
-----------------------------------------------------------------------------------
female
1 1 -12728.320** 177.063
Korean_vet
2 1 -14957.251** 665.505
metro
3 Not in metro area -6558.295** 1774.846
4 Central city -2033.091 1772.292
5 Outside central city 828.436 1768.691
6 Central city status unknown -3276.152 1778.913
7 Age 2494.205** 22.439
8 age_sq -26.561** 0.235
9 _cons -20601.844** 1831.883
-----------------------------------------------------------------------------------
* p < .05
** p < .01
. *by adding the age-squared term, we raised the R-squared from 6.7% to 17.9%, a very large improvement.
. *the t statistic is just the coefficient divided by its standard error; for age:
. display 2494/22.439
111.14577
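. *A short worked example of what the age and age_sq coefficients imply together. With a positive age term and a negative age_sq term, predicted incwage rises with age, peaks, and then falls. The numbers are copied from the desmat table above; the two ages (25 and 60) are arbitrary illustrations.

```stata
* Coefficients copied from the table above: age = 2494.205, age_sq = -26.561
* Age at which predicted incwage peaks: -b_age / (2 * b_age_sq)
display -2494.205 / (2 * (-26.561))
* Change in predicted incwage per extra year of age: b_age + 2 * b_age_sq * age
display 2494.205 + 2 * (-26.561) * 25
display 2494.205 + 2 * (-26.561) * 60
```

. *The peak comes out at about age 47. The slope is about +1,166 dollars per year at age 25 and about -693 dollars per year at age 60.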
. desmat: regress incwage female Korean_vet metro @age_sq [aweight= perwt_rounded]
-----------------------------------------------------------------------------------
Linear regression
-----------------------------------------------------------------------------------
Dependent variable incwage
Number of observations: 92865
aweight: perwt_rounded
F statistic: 999.508
Model degrees of freedom: 7
Residual degrees of freedom: 92857
R-squared: 0.070
Adjusted R-squared: 0.070
Root MSE 27678.724
Prob: 0.000
-----------------------------------------------------------------------------------
nr Effect Coeff s.e.
-----------------------------------------------------------------------------------
female
1 1 -12749.202** 188.474
Korean_vet
2 1 -14919.641** 708.397
metro
3 Not in metro area -8146.532** 1889.171
4 Central city -3646.682 1886.451
5 Outside central city -117.839 1882.660
6 Central city status unknown -4830.460* 1893.504
7 age_sq -1.041** 0.055
8 _cons 32143.370** 1883.393
-----------------------------------------------------------------------------------
* p < .05
** p < .01
. *leaving out age and including only age-squared (as we do above) is a bad idea: it pushes the R-squared back down to 7%. If age-squared is in the model, age should be in the model too.
. desrep, zval prob
-----------------------------------------------------------------------------------
Linear regression
-----------------------------------------------------------------------------------
Dependent variable incwage
Number of observations: 92865
aweight: perwt_rounded
F statistic: 999.508
Model degrees of freedom: 7
Residual degrees of freedom: 92857
R-squared: 0.070
Adjusted R-squared: 0.070
Root MSE 27678.724
Prob: 0.000
-----------------------------------------------------------------------------------
nr Effect Coeff s.e. t prob
-----------------------------------------------------------------------------------
female
1 1 -12749.202** 188.474 -67.644 0.000
Korean_vet
2 1 -14919.641** 708.397 -21.061 0.000
metro
3 Not in metro area -8146.532** 1889.171 -4.312 0.000
4 Central city -3646.682 1886.451 -1.933 0.053
5 Outside central city -117.839 1882.660 -0.063 0.950
6 Central city status unknown -4830.460* 1893.504 -2.551 0.011
7 age_sq -1.041** 0.055 -19.054 0.000
8 _cons 32143.370** 1883.393 17.067 0.000
-----------------------------------------------------------------------------------
* p < .05
** p < .01
. *if you are using desmat, it is handy to also use desrep, which is installed automatically with desmat. After a regression, desrep redisplays the regression output table and lets you control what the output looks like. In this case we use desrep, zval prob to generate the output with the z or t statistics and their probabilities as well.
. regress incwage female Korean_vet i.metro age age_sq [aweight= perwt_rounded]
(sum of wgt is 1.9224e+08)
Source | SS df MS Number of obs = 92865
-------------+------------------------------ F( 8, 92856) = 2535.41
Model | 1.3714e+13 8 1.7143e+12 Prob > F = 0.0000
Residual | 6.2784e+13 92856 676148962 R-squared = 0.1793
-------------+------------------------------ Adj R-squared = 0.1792
Total | 7.6499e+13 92864 823774386 Root MSE = 26003
------------------------------------------------------------------------------
incwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | -12728.32 177.0628 -71.89 0.000 -13075.36 -12381.28
Korean_vet | -14957.25 665.5054 -22.48 0.000 -16261.63 -13652.87
|
metro |
1 | -6558.295 1774.846 -3.70 0.000 -10036.97 -3079.617
2 | -2033.091 1772.292 -1.15 0.251 -5506.764 1440.582
3 | 828.4365 1768.691 0.47 0.640 -2638.18 4295.053
4 | -3276.152 1778.913 -1.84 0.066 -6762.803 210.4993
|
age | 2494.205 22.43866 111.16 0.000 2450.225 2538.184
age_sq | -26.56104 .2352543 -112.90 0.000 -27.02213 -26.09994
_cons | -20601.84 1831.883 -11.25 0.000 -24192.31 -17011.37
------------------------------------------------------------------------------
. xi: regress incwage female Korean_vet i.metro age_sq [aweight= perwt_rounded]
i.metro _Imetro_0-4 (naturally coded; _Imetro_0 omitted)
(sum of wgt is 1.9224e+08)
Source | SS df MS Number of obs = 92865
-------------+------------------------------ F( 7, 92857) = 999.51
Model | 5.3601e+12 7 7.6573e+11 Prob > F = 0.0000
Residual | 7.1139e+13 92857 766111788 R-squared = 0.0701
-------------+------------------------------ Adj R-squared = 0.0700
Total | 7.6499e+13 92864 823774386 Root MSE = 27679
------------------------------------------------------------------------------
incwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | -12749.2 188.4743 -67.64 0.000 -13118.61 -12379.79
Korean_vet | -14919.64 708.3965 -21.06 0.000 -16308.09 -13531.19
_Imetro_1 | -8146.532 1889.171 -4.31 0.000 -11849.29 -4443.776
_Imetro_2 | -3646.682 1886.451 -1.93 0.053 -7344.106 50.74231
_Imetro_3 | -117.8387 1882.66 -0.06 0.950 -3807.832 3572.155
_Imetro_4 | -4830.46 1893.504 -2.55 0.011 -8541.708 -1119.212
age_sq | -1.041315 .0546509 -19.05 0.000 -1.14843 -.9341998
_cons | 32143.37 1883.393 17.07 0.000 28451.94 35834.8
------------------------------------------------------------------------------
. desmat: regress incwage female Korean_vet metro @age @age_sq [aweight= perwt_rounded]
-----------------------------------------------------------------------------------
Linear regression
-----------------------------------------------------------------------------------
Dependent variable incwage
Number of observations: 92865
aweight: perwt_rounded
F statistic: 2535.406
Model degrees of freedom: 8
Residual degrees of freedom: 92856
R-squared: 0.179
Adjusted R-squared: 0.179
Root MSE 26002.864
Prob: 0.000
-----------------------------------------------------------------------------------
nr Effect Coeff s.e.
-----------------------------------------------------------------------------------
female
1 1 -12728.320** 177.063
Korean_vet
2 1 -14957.251** 665.505
metro
3 Not in metro area -6558.295** 1774.846
4 Central city -2033.091 1772.292
5 Outside central city 828.436 1768.691
6 Central city status unknown -3276.152 1778.913
7 Age 2494.205** 22.439
8 age_sq -26.561** 0.235
9 _cons -20601.844** 1831.883
-----------------------------------------------------------------------------------
* p < .05
** p < .01
. tabulate vetlast
Veteran's most recent |
period of service | Freq. Percent Cum.
---------------------------+-----------------------------------
NIU | 30,904 23.11 23.11
No service | 91,149 68.17 91.28
World War II | 2,428 1.82 93.10
Korean War | 1,716 1.28 94.38
8 | 3,683 2.75 97.14
Other service | 3,830 2.86 100.00
---------------------------+-----------------------------------
Total | 133,710 100.00
. codebook vetlast
-----------------------------------------------------------------------------------
vetlast Veteran's most recent period of service
-----------------------------------------------------------------------------------
type: numeric (byte)
label: vetlastlbl
range: [0,9] units: 1
unique values: 6 missing .: 0/133710
tabulation: Freq. Numeric Label
30904 0 NIU
91149 1 No service
2428 4 World War II
1716 6 Korean War
3683 8
3830 9 Other service
. gen korean_vet_new=0
. replace korean_vet_new=1 if vetlast==6
(1716 real changes made)
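. *A sketch of a one-step way to build the same indicator, plus a check on it. korean_vet_alt is a hypothetical name used only for this illustration. Note that, like korean_vet_new, it is 0 for every non-Korean-War code, including 0 (NIU) and the unlabeled code 8, and it has no missing values; the regressions that used the earlier Korean_vet variable had fewer observations (92865) than the ones below (103226).

```stata
* One-step equivalent of the gen/replace pair above (hypothetical variable name).
gen byte korean_vet_alt = (vetlast == 6)
* Cross-check: the two indicators should agree on every row.
assert korean_vet_alt == korean_vet_new
* Show which vetlast codes fall into each group, including code 0 (NIU).
tabulate vetlast korean_vet_new, missing
```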
. desmat: regress incwage female korean_vet_new metro @age @age_sq [aweight= perwt_rounded]
-----------------------------------------------------------------------------------
Linear regression
-----------------------------------------------------------------------------------
Dependent variable incwage
Number of observations: 103226
aweight: perwt_rounded
F statistic: 2837.895
Model degrees of freedom: 8
Residual degrees of freedom: 103217
R-squared: 0.180
Adjusted R-squared: 0.180
Root MSE 26676.307
Prob: 0.000
-----------------------------------------------------------------------------------
nr Effect Coeff s.e.
-----------------------------------------------------------------------------------
female
1 1 -12476.602** 168.069
korean_vet_new
2 1 -14247.101** 674.085
metro
3 Not in metro area -4837.811** 1685.612
4 Central city -191.578 1683.374
5 Outside central city 2793.861 1679.635
6 Central city status unknown -1278.184 1689.596
7 Age 2560.560** 21.927
8 age_sq -27.464** 0.227
9 _cons -23534.293** 1745.882
-----------------------------------------------------------------------------------
* p < .05
** p < .01
. desmat: regress inctot female korean_vet_new metro @age @age_sq [aweight= perwt_rounded]
-----------------------------------------------------------------------------------
Linear regression
-----------------------------------------------------------------------------------
Dependent variable inctot
Number of observations: 103226
aweight: perwt_rounded
F statistic: 2545.734
Model degrees of freedom: 8
Residual degrees of freedom: 103217
R-squared: 0.165
Adjusted R-squared: 0.165
Root MSE 29926.690
Prob: 0.000
-----------------------------------------------------------------------------------
nr Effect Coeff s.e.
-----------------------------------------------------------------------------------
female
1 1 -15702.652** 188.548
korean_vet_new
2 1 -4169.903** 756.219
metro
3 Not in metro area -6213.618** 1890.996
4 Central city -1081.445 1888.486
5 Outside central city 2576.141 1884.291
6 Central city status unknown -1796.468 1895.465
7 Age 2623.563** 24.599
8 age_sq -25.183** 0.254
9 _cons -22497.580** 1958.610
-----------------------------------------------------------------------------------
* p < .05
** p < .01
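. *the disadvantage of Korean War vets is much smaller for inctot (about -4,170) than for incwage (about -14,247). I suspect this is because veterans get benefits, which would count toward total income but not toward wage income.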
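. *One way to look at that gap directly, sketched here under assumptions and not run in the original session: regress the non-wage part of income on the same predictors. inc_nonwage is a hypothetical variable name. If both models use exactly the same observations and weights, OLS coefficients are additive in the dependent variable, so the korean_vet_new coefficient in this regression should equal the difference between the two coefficients above.

```stata
* Hypothetical variable: total income minus wage income.
gen inc_nonwage = inctot - incwage
* Same predictors and weights as the two regressions above.
desmat: regress inc_nonwage female korean_vet_new metro @age @age_sq [aweight= perwt_rounded]
* Difference between the inctot and incwage coefficients on korean_vet_new
display -4169.903 - (-14247.101)
```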
. *The key to this regression approach is that instead of throwing out the women and the people who don't fall in the same age range as the Korean vets (as we did in HW1, when we were trying to compare vets to non-vets of the same age), here we use regression to account for age and gender, and then see whether any residual difference remains between Korean vets and non-vets after controlling for age and gender. And it turns out that there is. Korean vets actually make less than similarly
. log close
name: <unnamed>
log: C:\Documents and Settings\Michael Rosenfeld\My Documents\newer web pages\soc_meth_proj3\2010_logs\section_four.log
log type: text
closed on:
-----------------------------------------------------------------------------------