class 11 log

---------------------------------------------------------------------------------------------------------

log: C:\AAA Miker Files\newer web pages\soc_388_notes\soc_388_2007\class_eleven_log.log

log type: text

opened on: 30 Oct 2007, 11:04:47

. edit

(4 vars, 8 obs pasted into editor)

- preserve

. table defendant victim [fweight=count], contents (mean death_penalty freq) row col

-------------------------------------

| victim

defendant | black white Total

----------+--------------------------

black | .058252 .174603 .10241

| 103 63 166

white | 0 .125828 .11875

| 9 151 160

Total | .053571 .140187 .110429

| 112 214 326

-------------------------------------

. set linesize 79

. *First relevant issue: the crude percentage who get the death penalty is a bit higher for white defendants (11.9%) than for black defendants (10.2%). A bit surprising at first.

. *Second, defendants whose victims were white are much more likely to get the death penalty than those whose victims were black, 14% versus 5%.

. *There is potentially another interesting relationship here, between race of victim and race of defendant, which is not immediately obvious from the table.

. *How about the crude odds ratio between defendant's race and victim's race?

. display 103*151/(9*63)

27.430335

. display ln(103*151/(9*63))

3.3116495

. *So the association between defendant's race and victim's race is strong: the odds ratio is about 27, and the log odds ratio is about 3.3.

. *Let's look at a couple of loglinear models.

. desmat: poisson count defendant victim death_penalty

----------------------------------------------------------------------------------

Poisson regression

----------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -86.805

LR chi square: 257.986

Model degrees of freedom: 3

Pseudo R-squared: 0.598

Prob: 0.000

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

count

defendant

1 white -0.037 0.111

victim

2 white 0.647** 0.117

death_penalty

3 1 -2.086** 0.177

4 _cons 3.927** 0.111

----------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 137.9293

Prob > chi2(4) = 0.0000

. *Well, no surprise that the mutual independence model doesn't fit.
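The degrees of freedom and p-value that poisgof reports can be checked by hand. This is only a sketch using Stata's built-in chi2tail() function: the table has 8 cells, and the independence model estimates 4 parameters (the constant plus three main effects).

```stata
* residual degrees of freedom: 8 cells minus 4 estimated parameters
display 8 - 4
* upper-tail p-value for the deviance reported by poisgof
display chi2tail(4, 137.9293)
```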

. desmat defendant*victim*death_penalty=dev(1)*dev(1)*dev(1)

Desmat generated the following design matrix:

nr Variables Term Parameterization

First Last

1 _x_1 defendant dev(1)

2 _x_2 victim dev(1)

3 _x_3 defendant.victim dev(1).dev(1)

4 _x_4 death_penalty dev(0)

5 _x_5 defendant.death_penalty dev(1).dev(0)

6 _x_6 victim.death_penalty dev(1).dev(0)

7 _x_7 defendant.victim.death_penalty dev(1).dev(1).dev(0)

. sw poisson count (_x_1 _x_2 _x_4) _x_3 _x_5 _x_6, forward pe(.001) pr(.05)

begin with empty model

p = 0.0000 < 0.0010 adding _x_1 _x_2 _x_4

p = 0.0000 < 0.0010 adding _x_3

Poisson regression Number of obs = 8

LR chi2(4) = 387.78

Prob > chi2 = 0.0000

Log likelihood = -21.906596 Pseudo R2 = 0.8985

------------------------------------------------------------------------------

count | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

_x_1 | -.3908398 .0946425 -4.13 0.000 -.5763358 -.2053438

_x_2 | .5821152 .0946425 6.15 0.000 .3966193 .7676112

_x_4 | -1.043181 .0883545 -11.81 0.000 -1.216353 -.8700094

_x_3 | .8279124 .0946425 8.75 0.000 .6424164 1.013408

_cons | 2.837895 .1170309 24.25 0.000 2.608518 3.067271

------------------------------------------------------------------------------

. desrep

----------------------------------------------------------------------------------

Poisson regression

----------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -21.907

LR chi square: 387.784

Model degrees of freedom: 4

Pseudo R-squared: 0.898

Prob: 0.000

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

count

defendant

1 white -0.391** 0.095

victim

2 white 0.582** 0.095

death_penalty

3 1 -1.043** 0.088

defendant.victim

4 white.white 0.828** 0.095

5 _cons 2.838** 0.117

----------------------------------------------------------------------------------

* p < .05

** p < .01

. *What we see here is that, with a .001 entry criterion, the only 2-way term that gets put into the model is the interaction between defendant race and victim race.

. *Let me see if I can run the model with indicator dummies to get the value of the odds ratio we got by hand.

. desmat: poisson count defendant*victim death_penalty=dev(1)

----------------------------------------------------------------------------------

Poisson regression

----------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -21.907

LR chi square: 387.784

Model degrees of freedom: 4

Pseudo R-squared: 0.898

Prob: 0.000

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

count

defendant

1 white -2.438** 0.348

victim

2 white -0.492** 0.160

defendant.victim

3 white.white 3.312** 0.379

death_penalty

4 1 -1.043** 0.088

5 _cons 3.475** 0.120

----------------------------------------------------------------------------------

* p < .05

** p < .01

. *We get our 3.31 log odds ratio for the defendant-victim interaction.

. poisgof

Goodness-of-fit chi2 = 8.131552

Prob > chi2(3) = 0.0434
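The 3.312 coefficient and its standard error can also be checked by hand from the collapsed defendant-by-victim counts (103, 63, 9, 151). This sketch assumes the usual large-sample formula for the standard error of a log odds ratio, the square root of the summed reciprocal cell counts.

```stata
* log odds ratio from the 2x2 defendant-by-victim counts
display ln((103*151)/(9*63))
* large-sample standard error; compare with the 0.379 reported by desmat
display sqrt(1/103 + 1/151 + 1/9 + 1/63)
```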

. *let's look at a couple of logistic regressions.

. desmat: logistic death_penalty victim [fweight=count]

----------------------------------------------------------------------------------

logistic

----------------------------------------------------------------------------------

Dependent variable death_penalty

Number of observations: 326

fweight: count

Initial log likelihood: -113.256

Log likelihood: -110.132

LR chi square: 6.250

Model degrees of freedom: 1

Pseudo R-squared: 0.028

Prob: 0.012

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

victim

1 white 1.058* 0.464

2 _cons -2.872** 0.420

----------------------------------------------------------------------------------

* p < .05

** p < .01

. lfit table

variable table not found

r(111);

. lfit, table

Logistic model for death_penalty, goodness-of-fit test

+--------------------------------------------------------+

| Group | Prob | Obs_1 | Exp_1 | Obs_0 | Exp_0 | Total |

|-------+--------+-------+-------+-------+-------+-------|

| 1 | 0.0536 | 6 | 6.0 | 106 | 106.0 | 112 |

| 2 | 0.1402 | 30 | 30.0 | 184 | 184.0 | 214 |

+--------------------------------------------------------+

+-----------------------+

| Group | Prob | _x_1 |

|-------+--------+------|

| 1 | 0.0536 | 0 |

| 2 | 0.1402 | 1 |

+-----------------------+

number of observations = 326

number of covariate patterns = 2

Pearson chi2(0) = 0.00

Prob > chi2 = .

. *What the logistic regression works with is a collapsed table: 2 covariate patterns (victim's race) by death_penalty, with the defendant's race dimension summed over.
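Because the defendant dimension is collapsed, both logistic coefficients can be reproduced from the four counts in the lfit table (6, 106, 30, 184). A minimal hand check, again assuming the large-sample standard error formula for a log odds ratio:

```stata
* constant: log odds of the death penalty when the victim is black
display ln(6/106)
* victim coefficient: log odds ratio, white versus black victims
display ln((30/184)/(6/106))
* large-sample standard error; compare with the 0.464 reported by desmat
display sqrt(1/6 + 1/106 + 1/30 + 1/184)
```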

. *So, which loglinear model would be the equivalent to this logistic regression?

. desmat: poisson count victim*death_penalty

----------------------------------------------------------------------------------

Poisson regression

----------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -83.736

LR chi square: 264.125

Model degrees of freedom: 3

Pseudo R-squared: 0.612

Prob: 0.000

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

count

victim

1 white 0.551** 0.122

death_penalty

2 1 -2.872** 0.420

victim.death_penalty

3 white.1 1.058* 0.464

4 _cons 3.970** 0.097

----------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 131.79

Prob > chi2(4) = 0.0000

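. *The coefficient for the key interaction (and its standard error) is exactly the same in the loglinear and the logistic formats: 1.058, s.e. 0.464. The two models give the same estimate of the victim effect, but the fit statistics are different, because the poisson model is fit to all 8 cells while the logistic model sees only the collapsed victim by death_penalty table.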

. desmat: logistic death_penalty defendant victim [fweight=count]

----------------------------------------------------------------------------------

logistic

----------------------------------------------------------------------------------

Dependent variable death_penalty

Number of observations: 326

fweight: count

Initial log likelihood: -113.256

Log likelihood: -109.541

LR chi square: 7.431

Model degrees of freedom: 2

Pseudo R-squared: 0.033

Prob: 0.024

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

defendant

1 white -0.440 0.401

victim

2 white 1.324* 0.519

3 _cons -2.842** 0.420

----------------------------------------------------------------------------------

* p < .05

** p < .01

. lfit, table

Logistic model for death_penalty, goodness-of-fit test

+--------------------------------------------------------+

| Group | Prob | Obs_1 | Exp_1 | Obs_0 | Exp_0 | Total |

|-------+--------+-------+-------+-------+-------+-------|

| 1 | 0.0362 | 0 | 0.3 | 9 | 8.7 | 9 |

| 2 | 0.0551 | 6 | 5.7 | 97 | 97.3 | 103 |

| 3 | 0.1237 | 19 | 18.7 | 132 | 132.3 | 151 |

| 4 | 0.1798 | 11 | 11.3 | 52 | 51.7 | 63 |

+--------------------------------------------------------+

+------------------------------+

| Group | Prob | _x_1 | _x_2 |

|-------+--------+------+------|

| 1 | 0.0362 | 1 | 0 |

| 2 | 0.0551 | 0 | 0 |

| 3 | 0.1237 | 1 | 1 |

| 4 | 0.1798 | 0 | 1 |

+------------------------------+

number of observations = 326

number of covariate patterns = 4

Pearson chi2(1) = 0.38

Prob > chi2 = 0.5400

. *The logistic regression above corresponds to the all-2-way loglinear model.

. desmat: poisson count defendant*victim defendant*death_penalty victim*death_penalty

----------------------------------------------------------------------------------

Poisson regression

----------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -18.191

LR chi square: 395.215

Model degrees of freedom: 6

Pseudo R-squared: 0.916

Prob: 0.000

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

count

defendant

1 white -2.418** 0.348

victim

2 white -0.633** 0.171

defendant.victim

3 white.white 3.358** 0.382

death_penalty

4 1 -2.842** 0.420

defendant.death_penalty

5 white.1 -0.440 0.401

victim.death_penalty

6 white.1 1.324* 0.519

7 _cons 4.578** 0.101

----------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = .7006815

Prob > chi2(1) = 0.4026

. *This all-2-way model fits well. It suggests that we could drop the defendant*death_penalty interaction (-0.440, s.e. 0.401) and still have a model with good fit. It also suggests, as our initial look at the data made us suspect, that race of defendant is not a significant factor in who gets the death penalty once race of victim is in the model.

. *The interactions with death_penalty in the poisson model are the same terms as the direct effects of the independent variables in the logistic regression.

. *How can we see the interaction between victim's race and defendant's race in the logistic regressions?

. desmat: logistic death_penalty victim*defendant [fweight=count]

----------------------------------------------------------------------------------

logistic

----------------------------------------------------------------------------------

Dependent variable death_penalty

Number of observations: 326

fweight: count

Initial log likelihood: -113.256

Log likelihood: -109.191

LR chi square: 8.132

Model degrees of freedom: 3

Pseudo R-squared: 0.036

Prob: 0.043

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

victim

1 white 1.230* 0.536

defendant

2 white -15.490** 0.413

victim.defendant

3 white.white 15.105 .

4 _cons -2.783** 0.421

----------------------------------------------------------------------------------

* p < .05

** p < .01

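. *Something weird starts to happen: the coefficients get far too big and become unreliable (note the missing standard error on the interaction). Why?

. *Because this logistic regression corresponds to the saturated model, and the data have a zero cell: none of the 9 white defendants with black victims got the death penalty.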
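The problem can be seen directly from the cell counts implied by the first table (6 of 103, 11 of 63, 0 of 9, 19 of 151 sentenced to death). A saturated model reproduces the observed odds in every cell, and the observed odds for white defendants with black victims are 0 to 9, whose log is undefined. A short sketch:

```stata
* black defendant, black victim: the constant of the saturated logit
display ln(6/97)
* victim effect among black defendants
display ln((11/52)/(6/97))
* white defendant, black victim: log of zero odds; Stata returns missing (.)
display ln(0/9)
```

The first two lines should agree with the -2.783 and 1.230 reported above. The third has no finite value, so the estimation routine drifts toward minus infinity and stops at a large negative number.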

. desmat: poisson count defendant*victim*death_penalty

----------------------------------------------------------------------------------

Poisson regression

----------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -17.841

LR chi square: 395.915

Model degrees of freedom: 7

Pseudo R-squared: 0.917

Prob: 0.000

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

count

defendant

1 white -2.377** 0.348

victim

2 white -0.623** 0.172

defendant.victim

3 white.white 3.309** 0.385

death_penalty

4 1 -2.783** 0.421

defendant.death_penalty

5 white.1 -14.711 2096.899

victim.death_penalty

6 white.1 1.230* 0.536

defendant.victim.death_penalty

7 white.white.1 14.326 2096.899

8 _cons 4.575** 0.102

----------------------------------------------------------------------------------

* p < .05

** p < .01

. *Here you see the crazy SEs...

. *because we have a zero.

. *What to do?

. *Answer, from Leo Goodman: add something to every cell.

. *It might seem kind of sacrilegious to add something to the data, but in some sense, if we want to see the saturated model, we have no choice.

. gen count_plus1=count+1

. desmat: poisson count_plus1 defendant*victim*death_penalty

----------------------------------------------------------------------------------

Poisson regression

----------------------------------------------------------------------------------

Dependent variable count_plus1

Optimization: ml

Number of observations: 8

Initial log likelihood: -209.216

Log likelihood: -19.054

LR chi square: 380.324

Model degrees of freedom: 7

Pseudo R-squared: 0.909

Prob: 0.000

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

count_plus1

defendant

1 white -2.282** 0.332

victim

2 white -0.615** 0.171

defendant.victim

3 white.white 3.202** 0.370

death_penalty

4 1 -2.639** 0.391

defendant.death_penalty

5 white.1 0.336 1.119

victim.death_penalty

6 white.1 1.154* 0.505

defendant.victim.death_penalty

7 white.white.1 -0.746 1.189

8 _cons 4.585** 0.101

----------------------------------------------------------------------------------

* p < .05

** p < .01

. *What do we see? First of all, the defendant*death_penalty interaction is still not significant. The 3-way interaction is nowhere near significant either.

. poisgof

Goodness-of-fit chi2 = .0000109

Prob > chi2(0) = .
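Since the saturated model fits the count_plus1 cells exactly, its death_penalty terms can be reproduced by hand. This sketch uses the counts after adding 1 to each cell, derived from the first table: black/black 7 vs 98, black/white 12 vs 53, white/black 1 vs 10, white/white 20 vs 133 (defendant/victim, death penalty vs none).

```stata
* defendant.death_penalty: white vs black defendants, black victims
display ln((1/10)/(7/98))
* victim.death_penalty: white vs black victims, black defendants
display ln((12/53)/(7/98))
* 3-way term: defendant odds ratio for white victims over that for black victims
display ln(((20/133)/(12/53)) / ((1/10)/(7/98)))
* standard error of the defendant.death_penalty term
display sqrt(1/1 + 1/10 + 1/7 + 1/98)
```

These should line up with the 0.336, 1.154, -0.746, and 1.119 in the table above. The large standard error comes almost entirely from the 1/1 term, that is, from the cell that was originally zero.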

. desmat: logistic death_penalty defendant*victim

----------------------------------------------------------------------------------

logistic

----------------------------------------------------------------------------------

Dependent variable death_penalty

Number of observations: 8

Initial log likelihood: -5.545

Log likelihood: -5.545

LR chi square: 0.000

Model degrees of freedom: 3

Pseudo R-squared: 0.000

Prob: 1.000

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

defendant

1 white 0.000 2.000

victim

2 white 0.000 2.000

defendant.victim

3 white.white 0.000 2.828

4 _cons 0.000 1.414

----------------------------------------------------------------------------------

* p < .05

** p < .01

. desmat: logistic death_penalty defendant*victim [fweight=count]

----------------------------------------------------------------------------------

logistic

----------------------------------------------------------------------------------

Dependent variable death_penalty

Number of observations: 326

fweight: count

Initial log likelihood: -113.256

Log likelihood: -109.191

LR chi square: 8.132

Model degrees of freedom: 3

Pseudo R-squared: 0.036

Prob: 0.043

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

defendant

1 white -15.490** 0.413

victim

2 white 1.230* 0.536

defendant.victim

3 white.white 15.105 .

4 _cons -2.783** 0.421

----------------------------------------------------------------------------------

* p < .05

** p < .01

. desmat: logistic death_penalty defendant*victim [fweight= count_plus1]

----------------------------------------------------------------------------------

logistic

----------------------------------------------------------------------------------

Dependent variable death_penalty

Number of observations: 334

fweight: count_plus1

Initial log likelihood: -122.393

Log likelihood: -119.485

LR chi square: 5.816

Model degrees of freedom: 3

Pseudo R-squared: 0.024

Prob: 0.121

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

defendant

1 white 0.336 1.119

victim

2 white 1.154* 0.505

defendant.victim

3 white.white -0.746 1.189

4 _cons -2.639** 0.391

----------------------------------------------------------------------------------

* p < .05

** p < .01

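. *The victim and constant estimates are pretty much the same as with count. The defendant and interaction terms, which blew up with count because of the zero cell, are now finite with usable standard errors, so here we get more stability. These estimates also match the death_penalty terms of the saturated poisson model on count_plus1.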

. *What would our best loglinear model look like?

. desmat: poisson count victim*defendant victim*death_penalty

----------------------------------------------------------------------------------

Poisson regression

----------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -18.782

LR chi square: 394.033

Model degrees of freedom: 5

Pseudo R-squared: 0.913

Prob: 0.000

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

count

victim

1 white -0.588** 0.164

defendant

2 white -2.438** 0.348

victim.defendant

3 white.white 3.312** 0.379

death_penalty

4 1 -2.872** 0.420

victim.death_penalty

5 white.1 1.058* 0.464

6 _cons 4.580** 0.101

----------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 1.881837

Prob > chi2(2) = 0.3903

. *This victim*defendant, victim*death_penalty model fits well by the likelihood ratio goodness-of-fit test.
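The choice between this model and the all-2-way model can also be framed as a nested-model comparison. The sketch below assumes the standard likelihood ratio test for nested loglinear models, using the two poisgof deviances reported in this log (1.881837 on 2 df and .7006815 on 1 df).

```stata
* difference in deviance: this model minus the all-2-way model
display 1.881837 - .7006815
* p-value on 1 degree of freedom (2 - 1)
display chi2tail(1, 1.881837 - .7006815)
```

The difference is the cost of dropping defendant*death_penalty. It matches the difference between the LR chi squares of the two corresponding logistic regressions (7.431 - 6.250), and the p-value is well above .05.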

. *Let's look at that same best model with count_plus1.

. desmat: poisson count_plus1 victim*defendant victim*death_penalty

----------------------------------------------------------------------------------

Poisson regression

----------------------------------------------------------------------------------

Dependent variable count_plus1

Optimization: ml

Number of observations: 8

Initial log likelihood: -209.216

Log likelihood: -19.607

LR chi square: 379.218

Model degrees of freedom: 5

Pseudo R-squared: 0.906

Prob: 0.000

----------------------------------------------------------------------------------

nr Effect Coeff s.e.

----------------------------------------------------------------------------------

count_plus1

victim

1 white -0.567** 0.162

defendant

2 white -2.256** 0.317

victim.defendant

3 white.white 3.112** 0.350

death_penalty

4 1 -2.603** 0.366

victim.death_penalty

5 white.1 0.843* 0.413

6 _cons 4.583** 0.101

----------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 1.105725

Prob > chi2(2) = 0.5753

. *The significance levels and directions of the key interactions are the same, but the values are a little different.

. *The one zero only became a problem when we were fitting the saturated model. Otherwise, the zero was hidden by being combined with other cells.
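One way to see how the zero gets hidden: in the victim*defendant victim*death_penalty model, the fitted counts have a closed form built from two-way margins, so the empty cell borrows from its neighbors. This sketch assumes the standard conditional-independence formula, fitted = n(defendant, victim) * n(victim, death_penalty) / n(victim), with the margins taken from the first table.

```stata
* white defendant, black victim, death penalty: observed 0
display 9*6/112
* white defendant, black victim, no death penalty: observed 9
display 9*106/112
```

The fitted count for the empty cell is small but positive, so every parameter of the model stays finite.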

. exit, clear