-------------------------------------------------------------------------------------------

      name:  <unnamed>

       log:  C:\Users\Michael\Documents\newer web pages\soc_meth_proj3\fall_2012_381_logs\c

> lass5.log

  log type:  text

 opened on:   9 Oct 2012, 13:30:28

 

. use "C:\Users\Michael\Desktop\cps_mar_2000_new_unchanged.dta", clear

 

 

. table sex if age>24 & age<35, contents (mean yrsed sd yrsed freq)

 

-------------------------------------------------

      Sex | mean(yrsed)    sd(yrsed)        Freq.

----------+--------------------------------------

     Male |    13.31212     2.967666        9,027

   Female |    13.55657     2.854472        9,511

 

* All the t-tests below, and the regression coefficients and their t-statistics, are based entirely on the mean, SD, and N of the 2 samples.

 

. ttest yrsed if age>24 & age<35, by(sex)

 

Two-sample t test with equal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

    Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335

  Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394

---------+--------------------------------------------------------------------

combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946

---------+--------------------------------------------------------------------

    diff |           -.2444469    .0427623               -.3282649   -.1606289

------------------------------------------------------------------------------

    diff = mean(Male) - mean(Female)                              t =  -5.7164

Ho: diff = 0                                     degrees of freedom =    18536

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
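
* A minimal sketch (Python, not part of the Stata session) that recomputes the equal-variance t-test from only the N, mean, and SD in the table above. The function name pooled_t is hypothetical; the inputs are copied from the log, so results agree with Stata only up to rounding.

```python
import math

def pooled_t(n1, m1, s1, n2, m2, s2):
    """Equal-variance two-sample t-test from summary statistics only."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    return (m1 - m2) / se, se, df

# Male and Female rows of the table (N, mean yrsed, sd yrsed)
t_stat, se_diff, df = pooled_t(9027, 13.31212, 2.967666,
                               9511, 13.55657, 2.854472)
print(round(t_stat, 4), round(se_diff, 7), df)
```
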

 

. ttest yrsed if age>24 & age<35, by(sex) unequal

 

Two-sample t test with unequal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

    Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335

  Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394

---------+--------------------------------------------------------------------

combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946

---------+--------------------------------------------------------------------

    diff |           -.2444469    .0428057                 -.32835   -.1605438

------------------------------------------------------------------------------

    diff = mean(Male) - mean(Female)                              t =  -5.7106

Ho: diff = 0                     Satterthwaite's degrees of freedom =  18383.6

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

* For education, the unequal-variance and equal-variance t-tests give similar results because the variances of the 2 subsamples are so similar to begin with.
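
* A minimal sketch (Python, not part of the Stata session) of the unequal-variance test: the standard error uses each group's own variance, and the degrees of freedom come from Satterthwaite's approximation. The function name welch_t is hypothetical; inputs are copied from the output above.

```python
import math

def welch_t(n1, m1, s1, n2, m2, s2):
    """Unequal-variance t-test with Satterthwaite degrees of freedom."""
    v1 = s1 ** 2 / n1  # squared standard error of the first group mean
    v2 = s2 ** 2 / n2  # squared standard error of the second group mean
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / se, se, df

t_stat, se_diff, df = welch_t(9027, 13.31212, 2.967666,
                              9511, 13.55657, 2.854472)
print(round(t_stat, 4), round(se_diff, 7), round(df, 1))
```
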

 

. gen months_ed=yrsed*12

(30484 missing values generated)

 

. ttest months_ed if age>24 & age<35, by(sex) unequal

 

Two-sample t test with unequal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

    Male |    9027    159.7454    .3748215    35.61199    159.0107    160.4802

  Female |    9511    162.6788    .3512319    34.25366    161.9903    163.3673

---------+--------------------------------------------------------------------

combined |   18538    161.2504    .2567052    34.95152    160.7472    161.7536

---------+--------------------------------------------------------------------

    diff |           -2.933363    .5136682                 -3.9402   -1.926525

------------------------------------------------------------------------------

    diff = mean(Male) - mean(Female)                              t =  -5.7106

Ho: diff = 0                     Satterthwaite's degrees of freedom =  18383.6

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

* Note that the change of scale affects the means and SDs, but not the t-statistic, which is unit free.
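
* A minimal sketch (Python, not part of the Stata session) of why the t-statistic is unit free: multiplying every mean and SD by 12 multiplies both the difference and its standard error by 12, so their ratio is unchanged. The helper welch is hypothetical; the year-scale inputs are copied from the log.

```python
import math

def welch(n1, m1, s1, n2, m2, s2):
    """Return (difference, standard error, t) for an unequal-variance test."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = m1 - m2
    return diff, se, diff / se

k = 12  # months per year, as in gen months_ed=yrsed*12
diff_y, se_y, t_years = welch(9027, 13.31212, 2.967666,
                              9511, 13.55657, 2.854472)
diff_m, se_m, t_months = welch(9027, k * 13.31212, k * 2.967666,
                               9511, k * 13.55657, k * 2.854472)
print(diff_m / diff_y, se_m / se_y, t_years, t_months)
```
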

 

. tabulate sex male

 

           |         male

       Sex |         0          1 |     Total

-----------+----------------------+----------

      Male |         0     64,791 |    64,791

    Female |    68,919          0 |    68,919

-----------+----------------------+----------

     Total |    68,919     64,791 |   133,710

 

* male is a dummy variable for gender (1 = male, 0 = female); the cross-tabulation above confirms that it matches sex.

 

 

. ttest months_ed if age>24 & age<35, by(sex)

 

Two-sample t test with equal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

    Male |    9027    159.7454    .3748215    35.61199    159.0107    160.4802

  Female |    9511    162.6788    .3512319    34.25366    161.9903    163.3673

---------+--------------------------------------------------------------------

combined |   18538    161.2504    .2567052    34.95152    160.7472    161.7536

---------+--------------------------------------------------------------------

    diff |           -2.933363    .5131471               -3.939178   -1.927547

------------------------------------------------------------------------------

    diff = mean(Male) - mean(Female)                              t =  -5.7164

Ho: diff = 0                                     degrees of freedom =    18536

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

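* Regression on the dummy variable is the same as the equal-variance t-test: the coefficient on male is the difference in means, with the same standard error and t-statistic.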

 

. regress yrsed male if age>24 & age<35

 

      Source |       SS       df       MS              Number of obs =   18538

-------------+------------------------------           F(  1, 18536) =   32.68

       Model |  276.742433     1  276.742433           Prob > F      =  0.0000

    Residual |  156979.922 18536  8.46892111           R-squared     =  0.0018

-------------+------------------------------           Adj R-squared =  0.0017

       Total |  157256.664 18537  8.48339343           Root MSE      =  2.9101

 

------------------------------------------------------------------------------

       yrsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

        male |  -.2444469   .0427623    -5.72   0.000    -.3282649   -.1606289

       _cons |   13.55657   .0298401   454.31   0.000     13.49808    13.61506

------------------------------------------------------------------------------
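
* A minimal sketch (Python, not part of the Stata session) that rebuilds the regression table above from the two groups' N, mean, and SD. With a single dummy regressor, the intercept is the mean of the 0 group (Female), the slope is the difference in means, and the residual mean square is the pooled variance. Variable names are hypothetical; inputs are copied from the log.

```python
import math

n1, m1, s1 = 9027, 13.31212, 2.967666  # male == 1
n0, m0, s0 = 9511, 13.55657, 2.854472  # male == 0 (Female)

df_resid = n1 + n0 - 2
# Residual mean square equals the pooled within-group variance
mse = ((n1 - 1) * s1 ** 2 + (n0 - 1) * s0 ** 2) / df_resid

coef_male = m1 - m0          # slope on the dummy
cons = m0                    # intercept
se_male = math.sqrt(mse * (1.0 / n1 + 1.0 / n0))
se_cons = math.sqrt(mse / n0)
t_male = coef_male / se_male
f_stat = t_male ** 2         # with one regressor, F = t squared
root_mse = math.sqrt(mse)
print(coef_male, se_male, t_male, cons, se_cons, f_stat, root_mse)
```
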

 

. ttest yrsed if age>24 & age<35, by(sex)

 

Two-sample t test with equal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

    Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335

  Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394

---------+--------------------------------------------------------------------

combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946

---------+--------------------------------------------------------------------

    diff |           -.2444469    .0427623               -.3282649   -.1606289

------------------------------------------------------------------------------

    diff = mean(Male) - mean(Female)                              t =  -5.7164

Ho: diff = 0                                     degrees of freedom =    18536

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

. table sex if age>24 & age<35, contents (mean yrsed sd yrsed freq)

 

-------------------------------------------------

      Sex | mean(yrsed)    sd(yrsed)        Freq.

----------+--------------------------------------

     Male |    13.31212     2.967666        9,027

   Female |    13.55657     2.854472        9,511

 

* Now weight by analytic weights, which yields the same sample size but slightly different means and SDs.

 

. table sex if age>24 & age<35 [aweight= perwt_rounded], contents (mean yrsed sd yrsed freq)

 

-------------------------------------------------

      Sex | mean(yrsed)    sd(yrsed)        Freq.

----------+--------------------------------------

     Male |     13.5574     2.819247        9,027

   Female |    13.76295     2.720855        9,511

-------------------------------------------------

 

. regress yrsed male if age>24 & age<35 [aweight= perwt_rounded]

(sum of wgt is   3.7786e+07)

 

      Source |       SS       df       MS              Number of obs =   18538

-------------+------------------------------           F(  1, 18536) =   25.52

       Model |  195.741395     1  195.741395           Prob > F      =  0.0000

    Residual |  142186.809 18536  7.67084641           R-squared     =  0.0014

-------------+------------------------------           Adj R-squared =  0.0013

       Total |  142382.551 18537   7.6809921           Root MSE      =  2.7696

 

------------------------------------------------------------------------------

       yrsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

        male |  -.2055446   .0406899    -5.05   0.000    -.2853005   -.1257887

       _cons |   13.76294   .0285199   482.57   0.000     13.70704    13.81885

------------------------------------------------------------------------------

* Regression with aweights is similar to, but not exactly the same as, the unweighted regression.

 

 

. table sex if age>24 & age<35 [aweight= perwt_rounded], contents (mean yrsed sd yrsed freq)

 

-------------------------------------------------

      Sex | mean(yrsed)    sd(yrsed)        Freq.

----------+--------------------------------------

     Male |     13.5574     2.819247        9,027

   Female |    13.76295     2.720855        9,511

-------------------------------------------------

 

* fweight gives the same means as aweight, but multiplies the N by about 2,000.

 

. table sex if age>24 & age<35 [fweight= perwt_rounded], contents (mean yrsed sd yrsed freq)

 

-------------------------------------------------

      Sex | mean(yrsed)    sd(yrsed)        Freq.

----------+--------------------------------------

     Male |     13.5574     2.819091     1.86e+07

   Female |    13.76295     2.720712     1.92e+07

-------------------------------------------------
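
* A minimal sketch (Python, not part of the Stata session) of why the SDs differ in the fourth decimal between aweight and fweight. It assumes that aweights are rescaled to sum to the number of observations N, so the variance divides by N - 1, while fweights treat the sum of weights W as the sample size and divide by W - 1. That assumption is not stated in the log; the group totals W are the rounded Freq. values shown above.

```python
import math

def fweight_sd_from_aweight_sd(sd_aw, n, w_total):
    """Convert an aweighted SD to the fweighted SD under the stated assumption."""
    # Same weighted sum of squares, different denominators: (n - 1)/n vs (W - 1)/W
    return sd_aw * math.sqrt((n - 1) / n) * math.sqrt(w_total / (w_total - 1))

sd_male_fw = fweight_sd_from_aweight_sd(2.819247, 9027, 1.86e7)
sd_female_fw = fweight_sd_from_aweight_sd(2.720855, 9511, 1.92e7)
print(round(sd_male_fw, 6), round(sd_female_fw, 6))
```
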

 

. regress yrsed male if age>24 & age<35 [fweight= perwt_rounded]

 

      Source |       SS       df       MS              Number of obs =37785945

-------------+------------------------------           F(  1,37785943) =52018.00

       Model |  398979.047     1  398979.047           Prob > F      =  0.0000

    Residual |   289818910 37785943  7.67001924           R-squared     =  0.0014

-------------+------------------------------           Adj R-squared =  0.0014

       Total |   290217889 37785944  7.68057796           Root MSE      =  2.7695

 

------------------------------------------------------------------------------

       yrsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

        male |  -.2055446   .0009012  -228.07   0.000    -.2073109   -.2037782

       _cons |   13.76294   .0006317  2.2e+04   0.000     13.76171    13.76418

------------------------------------------------------------------------------

 

* The fweighted regression yields a t-statistic larger by a factor of sqrt(2000), or about 45 times larger, which is totally unrealistic and unreasonable.
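
* A minimal sketch (Python, not part of the Stata session) checking the inflation factor against the two regression tables. The coefficient is identical under both weights, so the t-statistic grows by the same factor by which the standard error shrinks, which should be close to the square root of the ratio of the two reported sample sizes. Variable names are hypothetical; all numbers are copied from the log.

```python
import math

n_aweight = 18538       # Number of obs, aweighted regression
n_fweight = 37785945    # Number of obs, fweighted regression
se_aweight = 0.0406899  # Std. Err. of male, aweighted
se_fweight = 0.0009012  # Std. Err. of male, fweighted

n_ratio = n_fweight / n_aweight
expected_inflation = math.sqrt(n_ratio)
observed_inflation = se_aweight / se_fweight
print(round(n_ratio, 1), round(expected_inflation, 2), round(observed_inflation, 2))
```
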

 

. regress yrsed male if age>24 & age<35 [aweight= perwt_rounded]

(sum of wgt is   3.7786e+07)

 

      Source |       SS       df       MS              Number of obs =   18538

-------------+------------------------------           F(  1, 18536) =   25.52

       Model |  195.741395     1  195.741395           Prob > F      =  0.0000

    Residual |  142186.809 18536  7.67084641           R-squared     =  0.0014

-------------+------------------------------           Adj R-squared =  0.0013

       Total |  142382.551 18537   7.6809921           Root MSE      =  2.7696

 

------------------------------------------------------------------------------

       yrsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

        male |  -.2055446   .0406899    -5.05   0.000    -.2853005   -.1257887

       _cons |   13.76294   .0285199   482.57   0.000     13.70704    13.81885

------------------------------------------------------------------------------

 

. table sex if age>24 & age<35 [aweight= perwt_rounded], contents (mean yrsed sd yrsed freq)

 

-------------------------------------------------

      Sex | mean(yrsed)    sd(yrsed)        Freq.

----------+--------------------------------------

     Male |     13.5574     2.819247        9,027

   Female |    13.76295     2.720855        9,511

-------------------------------------------------

 

. gen random_uniform_2=uniform()

 

* generate a uniform random variable.

 

. summarize  random_uniform_2

 

    Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

random_uni~2 |    133710    .5006203    .2884588   .0000219   .9999971

 

* Use that uniform random variable to reduce the sample to ¼ of its prior size; note that the means and SDs change a little bit because of randomness.
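
* A minimal sketch (Python, not part of the Stata session) of the subsampling idea: attach an independent uniform(0,1) draw to each observation and keep those with a draw of at most .25, which retains about a quarter of the sample. The seed is arbitrary and the draws will not match Stata's uniform().

```python
import random

random.seed(381)  # arbitrary seed, used only to make the sketch repeatable
n_obs = 133710    # number of observations reported by summarize above

# One uniform draw per observation, analogous to random_uniform_2
draws = [random.random() for _ in range(n_obs)]

# Keep an observation when its draw is <= .25
kept = sum(1 for u in draws if u <= 0.25)
share_kept = kept / n_obs
print(kept, round(share_kept, 4))
```
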

 

. table sex if age>24 & age<35 &  random_uniform_2 <=.25 [aweight= perwt_rounded], contents (mean yrsed sd yrsed freq)

 

-------------------------------------------------

      Sex | mean(yrsed)    sd(yrsed)        Freq.

----------+--------------------------------------

     Male |    13.55302     2.804218        2,248

   Female |    13.72653     2.718835        2,330

-------------------------------------------------

 

. regress yrsed male if age>24 & age<35 &  random_uniform_2<=.25 [aweight= perwt_rounded]

(sum of wgt is   9.2846e+06)

 

      Source |       SS       df       MS              Number of obs =    4578

-------------+------------------------------           F(  1,  4576) =    4.52

       Model |  34.4468815     1  34.4468815           Prob > F      =  0.0336

    Residual |  34890.4634  4576  7.62466419           R-squared     =  0.0010

-------------+------------------------------           Adj R-squared =  0.0008

       Total |  34924.9102  4577  7.63052441           Root MSE      =  2.7613

 

------------------------------------------------------------------------------

       yrsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

        male |  -.1735029   .0816285    -2.13   0.034    -.3335342   -.0134715

       _cons |   13.72653    .057329   239.43   0.000     13.61413    13.83892

------------------------------------------------------------------------------

 

* The t-statistic is roughly one half as large, i.e., sqrt(1/4) times as large, as before.
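
* A minimal sketch (Python, not part of the Stata session) separating the two reasons the t-statistic changed in the quarter sample. The standard error roughly doubles, as sqrt(N) scaling predicts, and the coefficient itself also moved because of sampling randomness, so the t-statistic falls to somewhat less than half. Variable names are hypothetical; all numbers are copied from the two regression tables.

```python
import math

n_full, n_sub = 18538, 4578
coef_full, se_full = -0.2055446, 0.0406899  # full-sample aweighted regression
coef_sub, se_sub = -0.1735029, 0.0816285    # quarter-sample aweighted regression

predicted_se_ratio = math.sqrt(n_full / n_sub)  # what sqrt(N) scaling implies
se_ratio = se_sub / se_full                     # observed growth in the SE
coef_ratio = coef_sub / coef_full               # change due to randomness
t_ratio = (coef_sub / se_sub) / (coef_full / se_full)
print(round(predicted_se_ratio, 3), round(se_ratio, 3),
      round(coef_ratio, 3), round(t_ratio, 3))
```
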

 

. regress yrsed male if age>24 & age<35  [aweight= perwt_rounded]

(sum of wgt is   3.7786e+07)

 

      Source |       SS       df       MS              Number of obs =   18538

-------------+------------------------------           F(  1, 18536) =   25.52

       Model |  195.741395     1  195.741395           Prob > F      =  0.0000

    Residual |  142186.809 18536  7.67084641           R-squared     =  0.0014

-------------+------------------------------           Adj R-squared =  0.0013

       Total |  142382.551 18537   7.6809921           Root MSE      =  2.7696

 

------------------------------------------------------------------------------

       yrsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

        male |  -.2055446   .0406899    -5.05   0.000    -.2853005   -.1257887

       _cons |   13.76294   .0285199   482.57   0.000     13.70704    13.81885

------------------------------------------------------------------------------

 

. log close

      name:  <unnamed>

       log:  C:\Users\Michael\Documents\newer web pages\soc_meth_proj3\fall_2012_381

> _logs\class5.log

  log type:  text

 closed on:   9 Oct 2012, 15:47:14

------------------------------------------------------------------------------------