---------------------------------------------------------------------------------------------------------
log: C:\AAA Miker Files\newer web pages\soc_388_notes\soc_388_2007\class_nine_log.log
log type: text
opened on: 23 Oct 2007, 11:08:28
. use "C:\AAA Miker Files\newer web pages\soc_388_notes\LA_intermar.dta", clear
. clear
. use "C:\AAA Miker Files\newer web pages\soc_388_notes\LA intermar class 5.dta", clear
. *Let me just show you something quickly about how to test parameters, using the HW2 dataset, LA intermarriage
. set linesize 79
. desmat: poisson count hed wed interfull
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 16
Initial log likelihood: -221501.223
Log likelihood: -24059.274
LR chi square: 394883.898
Model degrees of freedom: 10
Pseudo R-squared: 0.891
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
hed
1 2 1.134** 0.007
2 3 0.819** 0.006
3 4 -0.017* 0.007
wed
4 2 1.372** 0.007
5 3 1.020** 0.007
6 4 -0.278** 0.008
interfull
7 1 1.722** 0.009
8 2 0.676** 0.007
9 3 0.537** 0.008
10 4 2.487** 0.009
11 _cons 8.652** 0.008
-------------------------------------------------------------------------------
* p < .05
** p < .01
. *How to test whether one coefficient is significantly different from another.
. lincom _x_8-_x_9
( 1) [count]_x_8 - [count]_x_9 = 0
------------------------------------------------------------------------------
count | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | .139055 .0122419 11.36 0.000 .1150614 .1630486
------------------------------------------------------------------------------
. *There is a significant difference between the intermarriage coefficients for categories 2 and 3 of interfull (_x_8 and _x_9).
. test _x_8-_x_9=0
( 1) [count]_x_8 - [count]_x_9 = 0
chi2( 1) = 129.03
Prob > chi2 = 0.0000
. *Are these tests the same?
. display 11.36^2
129.0496
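. *Something we will talk about later in the class is why a chi-square with 1 degree of freedom is really just the distribution of the square of a standard Normal variable. In any event, you can see here that the two tests (lincom and test) provide the same information. The small gap between 129.05 and 129.03 is only rounding, because lincom displays z to two decimals.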
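. *A sketch of how you could check the lincom/test equivalence by hand. These lines were not run in this session, so no output is shown. The coefficient and standard error are copied from the lincom output above; normal() and chi2tail() are Stata's built-in Normal CDF and chi-square upper-tail functions.

```stata
* Sketch only: not run in the original session.
* z statistic rebuilt from the lincom coefficient and standard error.
display .139055/.0122419

* Its square, which should land near the chi2(1) value reported by -test-.
display (.139055/.0122419)^2

* Two-sided Normal p-value for z ...
display 2*normal(-abs(.139055/.0122419))

* ... and the chi2(1) upper-tail p-value for z^2; the two should agree.
display chi2tail(1, (.139055/.0122419)^2)
```

. *Working from the unrounded ratio rather than the displayed 11.36 is what should bring the square closer to the 129.03 that test reports.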
. clear
. *Now on to the Qian dataset.
. use "C:\AAA Miker Files\newer web pages\soc_388_notes\Qian 80-90 intermar.dta ", clear
. desmat: poisson count mfulleth med4 ffulleth fed4 year
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 512
Initial log likelihood: -1402408.286
Log likelihood: -289165.150
LR chi square: 2226486.273
Model degrees of freedom: 13
Pseudo R-squared: 0.794
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
mfulleth
1 Hisp 3.431** 0.037
2 black 3.894** 0.037
3 white 6.437** 0.037
med4
4 2 1.072** 0.004
5 3 0.595** 0.005
6 4 0.235** 0.005
ffulleth
7 Hisp 3.262** 0.035
8 black 3.695** 0.034
9 white 6.281** 0.034
fed4
10 2 1.229** 0.004
11 3 0.733** 0.005
12 4 0.142** 0.005
year
13 90 -0.415** 0.003
14 _cons -4.279** 0.050
-------------------------------------------------------------------------------
* p < .05
** p < .01
. poisgof
Goodness-of-fit chi2 = 576133.3
Prob > chi2(498) = 0.0000
. table mfulleth ffulleth, contents (mean r_endog)
--------------------------------------
| ffulleth
mfulleth | Asian Hisp black white
----------+---------------------------
Asian | 1 0 0 0
Hisp | 0 2 0 0
black | 0 0 3 0
white | 0 0 0 4
--------------------------------------
. desmat: poisson count mfulleth*med4 ffulleth*fed4 year r_endog
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 512
Initial log likelihood: -1402408.286
Log likelihood: -123146.804
LR chi square: 2558522.965
Model degrees of freedom: 35
Pseudo R-squared: 0.912
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
mfulleth
1 Hisp 5.526** 0.222
2 black 3.649** 0.223
3 white 6.035** 0.224
med4
4 2 1.846** 0.235
5 3 2.461** 0.227
6 4 2.790** 0.225
mfulleth.med4
7 Hisp.2 -1.506** 0.235
8 Hisp.3 -2.624** 0.228
9 Hisp.4 -4.150** 0.226
10 black.2 -0.972** 0.235
11 black.3 -2.149** 0.228
12 black.4 -3.553** 0.226
13 white.2 -0.701** 0.235
14 white.3 -1.786** 0.227
15 white.4 -2.413** 0.225
ffulleth
16 Hisp 4.831** 0.175
17 black 1.810** 0.178
18 white 5.377** 0.177
fed4
19 2 1.705** 0.186
20 3 2.105** 0.182
21 4 2.387** 0.179
ffulleth.fed4
22 Hisp.2 -1.235** 0.187
23 Hisp.3 -2.235** 0.183
24 Hisp.4 -3.838** 0.182
25 black.2 -0.708** 0.187
26 black.3 -1.550** 0.182
27 black.4 -2.896** 0.180
28 white.2 -0.400* 0.186
29 white.3 -1.297** 0.182
30 white.4 -2.123** 0.179
year
31 90 -0.415** 0.003
r_endog
32 1 4.497** 0.089
33 2 2.107** 0.042
34 3 6.914** 0.050
35 4 2.871** 0.042
36 _cons -5.892** 0.282
-------------------------------------------------------------------------------
* p < .05
** p < .01
. poisgof
Goodness-of-fit chi2 = 244096.6
Prob > chi2(476) = 0.0000
. *For this class I am not bothering with deviation coding for the variables, because I don't care about the coefficients today. I just want to look at goodness of fit and number of terms.
. desmat: poisson count mfulleth*med4*fed4 ffulleth*med4*fed4 year r_endog
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 512
Initial log likelihood: -1402408.286
Log likelihood: -10672.055
LR chi square: 2783472.463
Model degrees of freedom: 116
Pseudo R-squared: 0.992
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
mfulleth
1 Hisp 5.374** 0.382
2 black 3.227** 0.388
3 white 5.893** 0.383
med4
4 2 0.639 0.566
5 3 0.651 0.620
6 4 -1.406 1.305
mfulleth.med4
7 Hisp.2 -1.193* 0.470
8 Hisp.3 -1.907** 0.521
9 Hisp.4 -2.073 1.135
10 black.2 -0.650 0.479
11 black.3 -1.098* 0.538
12 black.4 -2.042 1.203
13 white.2 -0.818 0.469
14 white.3 -1.637** 0.519
15 white.4 -1.278 1.125
fed4
16 2 0.343 0.584
17 3 -1.335 1.119
18 4 -0.738 1.116
mfulleth.fed4
19 Hisp.2 -0.881 0.482
20 Hisp.3 0.524 1.080
21 Hisp.4 -1.202 1.149
22 black.2 -0.600 0.491
23 black.3 0.987 1.088
24 black.4 -0.922 1.205
25 white.2 -0.568 0.481
26 white.3 0.754 1.079
27 white.4 -0.811 1.141
med4.fed4
28 2.2 1.901** 0.699
29 2.3 3.225** 1.189
30 2.4 0.520 1.283
31 3.2 1.712* 0.749
32 3.3 4.484** 1.207
33 3.4 3.269** 1.208
34 4.2 2.680 1.394
35 4.3 6.282** 1.667
36 4.4 6.928** 1.661
mfulleth.med4.fed4
37 Hisp.2.2 0.506 0.567
38 Hisp.2.3 -1.314 1.127
39 Hisp.2.4 0.755 1.294
40 Hisp.3.2 0.613 0.616
41 Hisp.3.3 -1.314 1.141
42 Hisp.3.4 0.316 1.215
43 Hisp.4.2 -0.458 1.191
44 Hisp.4.3 -1.875 1.525
45 Hisp.4.4 -0.390 1.572
46 black.2.2 0.361 0.579
47 black.2.3 -1.549 1.138
48 black.2.4 0.944 1.351
49 black.3.2 0.335 0.635
50 black.3.3 -1.734 1.155
51 black.3.4 0.284 1.275
52 black.4.2 0.189 1.264
53 black.4.3 -1.352 1.583
54 black.4.4 0.055 1.662
55 white.2.2 0.625 0.566
56 white.2.3 -1.074 1.125
57 white.2.4 1.284 1.284
58 white.3.2 0.746 0.613
59 white.3.3 -1.024 1.139
60 white.3.4 0.648 1.205
61 white.4.2 -0.325 1.180
62 white.4.3 -1.472 1.516
63 white.4.4 -0.034 1.558
ffulleth
64 Hisp 4.401** 0.271
65 black 1.628** 0.280
66 white 5.213** 0.271
ffulleth.med4
67 Hisp.2 -0.039 0.401
68 Hisp.3 -0.558 0.485
69 Hisp.4 -0.780 1.101
70 black.2 -0.173 0.411
71 black.3 -1.091* 0.503
72 black.4 -0.072 1.173
73 white.2 0.115 0.400
74 white.3 -0.493 0.482
75 white.4 -0.721 1.092
ffulleth.fed4
76 Hisp.2 -0.016 0.409
77 Hisp.3 -1.071* 0.470
78 Hisp.4 -2.292** 0.695
79 black.2 0.278 0.419
80 black.3 -0.660 0.488
81 black.4 -1.463 0.781
82 white.2 0.330 0.408
83 white.3 -0.742 0.468
84 white.4 -1.903** 0.679
ffulleth.med4.fed4
85 Hisp.2.2 -1.019* 0.514
86 Hisp.2.3 -0.410 0.574
87 Hisp.2.4 0.143 0.817
88 Hisp.3.2 -0.710 0.591
89 Hisp.3.3 -0.003 0.627
90 Hisp.3.4 0.140 0.816
91 Hisp.4.2 -0.028 1.186
92 Hisp.4.3 -0.313 1.174
93 Hisp.4.4 -0.076 1.276
94 black.2.2 -0.871 0.528
95 black.2.3 -0.213 0.595
96 black.2.4 0.357 0.903
97 black.3.2 -0.507 0.612
98 black.3.3 0.253 0.653
99 black.3.4 0.253 0.902
100 black.4.2 -1.177 1.260
101 black.4.3 -1.412 1.249
102 black.4.4 -0.946 1.385
103 white.2.2 -0.989 0.513
104 white.2.3 -0.348 0.570
105 white.2.4 0.519 0.799
106 white.3.2 -0.536 0.588
107 white.3.3 0.108 0.623
108 white.3.4 0.588 0.800
109 white.4.2 0.279 1.175
110 white.4.3 -0.019 1.164
111 white.4.4 0.625 1.259
year
112 90 -0.415** 0.003
r_endog
113 1 4.139** 0.095
114 2 2.003** 0.043
115 3 6.921** 0.050
116 4 2.878** 0.042
117 _cons -4.370** 0.438
-------------------------------------------------------------------------------
* p < .05
** p < .01
. *Important note: desmat automatically generates all the lower-order terms for you when you join variables with *. Also note that these models are summarized in my comprehensive Excel file.
. poisgof
Goodness-of-fit chi2 = 19147.11
Prob > chi2(395) = 0.0000
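. *A sketch of how to confirm that desmat filled in all the lower-order terms, by counting parameters. These lines were not run in this session, so no output is shown. The level counts are read off the coefficient table above: each of mfulleth, ffulleth, med4 and fed4 contributes 3 non-reference levels, year contributes 1, and r_endog contributes 4.

```stata
* Sketch only: not run in the original session.
* Terms generated by mfulleth*med4*fed4:
* three main effects, three two-way interactions, one three-way interaction.
display 3 + 3 + 3 + 3*3 + 3*3 + 3*3 + 3*3*3

* Terms that ffulleth*med4*fed4 adds on top of those
* (med4, fed4 and med4.fed4 are already in the model).
display 3 + 3*3 + 3*3 + 3*3*3

* Add year (1) and r_endog (4) to get the model degrees of freedom.
display 63 + 48 + 1 + 4

* Residual df = number of cells minus (model df + constant).
display 512 - (116 + 1)
```

. *The counts should line up with the log: coefficient 63 is the last mfulleth.med4.fed4 term, the model has 116 degrees of freedom, and poisgof reports chi2(395).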
. *This model still fits terribly. What do we think is missing?
. *What's missing? Take the previous model and add the interaction of every other variable with year.
. desmat: poisson count mfulleth*med4*fed4 ffulleth*med4*fed4 mfulleth*year ffulleth*year med4*year fed4*year r_endog
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 512
Initial log likelihood: -1402408.286
Log likelihood: -2532.653
LR chi square: 2799751.267
Model degrees of freedom: 128
Pseudo R-squared: 0.998
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
mfulleth
1 Hisp 5.304** 0.384
2 black 3.147** 0.389
3 white 5.796** 0.384
med4
4 2 0.672 0.566
5 3 0.646 0.620
6 4 -1.305 1.306
mfulleth.med4
7 Hisp.2 -1.186* 0.470
8 Hisp.3 -1.908** 0.521
9 Hisp.4 -2.053 1.135
10 black.2 -0.642 0.479
11 black.3 -1.099* 0.538
12 black.4 -2.018 1.203
13 white.2 -0.809 0.469
14 white.3 -1.639** 0.519
15 white.4 -1.249 1.125
fed4
16 2 0.359 0.584
17 3 -1.551 1.119
18 4 -0.930 1.116
mfulleth.fed4
19 Hisp.2 -0.878 0.482
20 Hisp.3 0.492 1.080
21 Hisp.4 -1.231 1.149
22 black.2 -0.596 0.491
23 black.3 0.949 1.088
24 black.4 -0.956 1.205
25 white.2 -0.563 0.481
26 white.3 0.709 1.079
27 white.4 -0.852 1.141
med4.fed4
28 2.2 1.900** 0.699
29 2.3 3.246** 1.189
30 2.4 0.539 1.283
31 3.2 1.713* 0.749
32 3.3 4.481** 1.207
33 3.4 3.266** 1.208
34 4.2 2.675 1.395
35 4.3 6.350** 1.668
36 4.4 6.989** 1.662
mfulleth.med4.fed4
37 Hisp.2.2 0.506 0.567
38 Hisp.2.3 -1.313 1.127
39 Hisp.2.4 0.756 1.294
40 Hisp.3.2 0.613 0.616
41 Hisp.3.3 -1.314 1.141
42 Hisp.3.4 0.316 1.215
43 Hisp.4.2 -0.459 1.191
44 Hisp.4.3 -1.870 1.525
45 Hisp.4.4 -0.386 1.571
46 black.2.2 0.361 0.579
47 black.2.3 -1.548 1.138
48 black.2.4 0.945 1.351
49 black.3.2 0.335 0.635
50 black.3.3 -1.734 1.155
51 black.3.4 0.284 1.275
52 black.4.2 0.188 1.263
53 black.4.3 -1.347 1.583
54 black.4.4 0.060 1.661
55 white.2.2 0.624 0.566
56 white.2.3 -1.073 1.125
57 white.2.4 1.285 1.284
58 white.3.2 0.746 0.613
59 white.3.3 -1.025 1.139
60 white.3.4 0.648 1.206
61 white.4.2 -0.325 1.180
62 white.4.3 -1.466 1.516
63 white.4.4 -0.029 1.558
ffulleth
64 Hisp 4.270** 0.272
65 black 1.615** 0.281
66 white 5.104** 0.272
ffulleth.med4
67 Hisp.2 -0.028 0.401
68 Hisp.3 -0.560 0.485
69 Hisp.4 -0.742 1.101
70 black.2 -0.172 0.411
71 black.3 -1.092* 0.503
72 black.4 -0.070 1.172
73 white.2 0.125 0.399
74 white.3 -0.495 0.482
75 white.4 -0.690 1.092
ffulleth.fed4
76 Hisp.2 -0.011 0.409
77 Hisp.3 -1.136* 0.470
78 Hisp.4 -2.350** 0.695
79 black.2 0.278 0.419
80 black.3 -0.667 0.488
81 black.4 -1.470 0.781
82 white.2 0.334 0.408
83 white.3 -0.796 0.468
84 white.4 -1.953** 0.679
ffulleth.med4.fed4
85 Hisp.2.2 -1.019* 0.514
86 Hisp.2.3 -0.408 0.574
87 Hisp.2.4 0.144 0.817
88 Hisp.3.2 -0.710 0.591
89 Hisp.3.3 -0.003 0.627
90 Hisp.3.4 0.140 0.816
91 Hisp.4.2 -0.030 1.185
92 Hisp.4.3 -0.304 1.174
93 Hisp.4.4 -0.068 1.276
94 black.2.2 -0.871 0.528
95 black.2.3 -0.213 0.595
96 black.2.4 0.357 0.903
97 black.3.2 -0.507 0.612
98 black.3.3 0.254 0.653
99 black.3.4 0.253 0.902
100 black.4.2 -1.178 1.260
101 black.4.3 -1.410 1.249
102 black.4.4 -0.944 1.385
103 white.2.2 -0.989 0.513
104 white.2.3 -0.346 0.570
105 white.2.4 0.520 0.799
106 white.3.2 -0.536 0.588
107 white.3.3 0.108 0.623
108 white.3.4 0.588 0.800
109 white.4.2 0.277 1.175
110 white.4.3 -0.011 1.163
111 white.4.4 0.632 1.259
year
112 90 -1.120** 0.091
mfulleth.year
113 Hisp.90 0.204* 0.090
114 black.90 0.235* 0.097
115 white.90 0.281** 0.089
ffulleth.year
116 Hisp.90 0.388** 0.083
117 black.90 0.044 0.091
118 white.90 0.331** 0.082
med4.year
119 2.90 -0.144** 0.009
120 3.90 0.021* 0.010
121 4.90 -0.504** 0.012
fed4.year
122 2.90 -0.066** 0.010
123 3.90 0.689** 0.011
124 4.90 0.625** 0.013
r_endog
125 1 4.120** 0.095
126 2 2.005** 0.043
127 3 6.920** 0.051
128 4 2.876** 0.042
129 _cons -4.126** 0.439
-------------------------------------------------------------------------------
* p < .05
** p < .01
. poisgof
Goodness-of-fit chi2 = 2868.306
Prob > chi2(383) = 0.0000
. *Still a poor fit by the likelihood-ratio test, but the BIC becomes negative here, meaning BIC prefers this model to the saturated model.
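. *A sketch of how the 12 year interactions could be judged as a block, by comparing this model with the previous one. These lines were not run in this session, so no output is shown. The deviances and residual degrees of freedom are copied from the last two poisgof results.

```stata
* Sketch only: not run in the original session.
* Drop in deviance from adding the year interactions.
display 19147.11 - 2868.306

* Change in residual degrees of freedom (the same as 128 - 116 model df).
display 395 - 383

* Upper-tail chi-square p-value for that drop on 12 df.
display chi2tail(12, 19147.11 - 2868.306)
```

. *A drop of this size on 12 degrees of freedom says the year interactions matter a great deal, even though the model as a whole is still rejected against the saturated model.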
. tabulate med4 [fweight=count]
med4 | Freq. Percent Cum.
------------+-----------------------------------
1 | 74,785 14.28 14.28
2 | 218,475 41.73 56.01
3 | 135,645 25.91 81.92
4 | 94,637 18.08 100.00
------------+-----------------------------------
Total | 523,542 100.00
. display ln(523542)
13.168373
. display 2868-(383*13.168)
-2175.344
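. *A sketch of the same BIC arithmetic, BIC = G2 - df*ln(N), applied to all four Qian models so they can be compared. These lines were not run in this session, so no output is shown. G2 and df come from each poisgof above and N = 523542 from the tabulate total. The inputs here are unrounded, so the last line will differ slightly from the -2175.344 above.

```stata
* Sketch only: not run in the original session.
* Main effects only.
display 576133.3 - 498*ln(523542)

* Adds ethnicity-by-education interactions and r_endog.
display 244096.6 - 476*ln(523542)

* Adds the three-way interactions with med4*fed4.
display 19147.11 - 395*ln(523542)

* Adds the interactions with year.
display 2868.306 - 383*ln(523542)
```

. *Only the last model should come out with a negative BIC; the first three stay positive, so by this criterion they are still worse than the saturated model.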
. *One of the things we will talk more about on Thursday is how to be systematic about whether or not you need a certain set of interactions. That is, how to go about building models so that you start to have some confidence that you have the important stuff in there.
. exit, clear