Code from previous and current projects

 

 

All code is provided without any warranty or support.  If you use any of my code in your research, I kindly ask that you cite the respective paper. 

 


 

 

 

"Correcting for Cross-sectional and Time-Series Dependence in Accounting Research" [paper]

  • Panel Data Simulations in Matlab:  Replicate Tables 1 and 2 of the paper [code]

  • Code to implement two-way cluster-robust std. errors: 

     

    • Stata: Mitchell Petersen has posted code for OLS, probit, logit, and tobit. In the paper we use a version of the OLS routine which can accommodate a large number of fixed effects [code].  It is relatively straightforward to modify Mitchell Petersen's code to produce two-way cluster-robust std. errors for other regression techniques (e.g., ordered logit [code]).  

     

    • SAS: This code relies on SAS's built-in one-way clustering procedure, SURVEYREG, and can be computationally intensive on large datasets.  The procedure cannot handle large sets of dummy variables/fixed effects, as these often result in non-invertible covariance matrices. The Stata code above is less computationally intensive. [code]

     

    • Matlab: This code estimates one-way and two-way cluster-robust std. errors [code]

     

    • Addendum: Ian Gow has posted test data to illustrate the use of two-way cluster-robust std. errors in Stata, SAS, and Matlab [test data]

     

    • Bootstrapping: Methods with asymptotic foundations tend to perform poorly in small samples. A straightforward way to correct for this is to use bootstrapping. One can compute two-way cluster-robust standard errors using the cluster bootstrap technique [code]. The syntax is as follows:

cluster2boot dependent_variable independent_variables,  fcluster(cluster_variable_one) tcluster(cluster_variable_two) nboot(number of bootstrap iterations)

It is relatively straightforward to modify this code to apply to other estimation techniques. An added advantage of using the cluster-bootstrap technique is that it allows the user to compute two-way cluster-robust standard errors for estimation techniques that do not have a built-in "cluster" option in Stata.  
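The two-way cluster-robust calculation referred to throughout this list can be sketched in a few lines. The sketch below is not the posted Stata, SAS, or Matlab code. It is a minimal, dependency-free Python illustration of the usual decomposition, in which the two-way covariance equals the covariance clustered by the first dimension, plus the covariance clustered by the second, minus the covariance clustered by their intersection. It assumes a model with a constant and a single regressor and applies no small-sample correction; the function names and the toy data are hypothetical.

```python
from collections import defaultdict


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def fit_ols(y, x):
    """OLS of y on a constant and x.

    Returns (coefficients, residuals, bread), where bread is (X'X)^-1.
    """
    n = len(y)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]
    b0 = bread[0][0] * sy + bread[0][1] * sxy
    b1 = bread[1][0] * sy + bread[1][1] * sxy
    resid = [yi - b0 - b1 * xi for yi, xi in zip(y, x)]
    return (b0, b1), resid, bread


def cluster_cov(x, resid, bread, groups):
    """One-way cluster-robust sandwich covariance (no small-sample correction)."""
    scores = defaultdict(lambda: [0.0, 0.0])
    for xi, ei, g in zip(x, resid, groups):
        scores[g][0] += ei        # constant column times residual
        scores[g][1] += xi * ei   # regressor times residual
    meat = [[sum(s[i] * s[j] for s in scores.values()) for j in range(2)]
            for i in range(2)]
    return matmul(matmul(bread, meat), bread)


def two_way_cluster_se(y, x, firm, time):
    """Two-way cluster-robust std. errors: V_firm + V_time - V_(firm x time)."""
    coefs, resid, bread = fit_ols(y, x)
    v_firm = cluster_cov(x, resid, bread, firm)
    v_time = cluster_cov(x, resid, bread, time)
    # Intersection clusters: every distinct (firm, time) pair is one cluster.
    v_both = cluster_cov(x, resid, bread, list(zip(firm, time)))
    var = [v_firm[i][i] + v_time[i][i] - v_both[i][i] for i in range(2)]
    # The combined variance is not guaranteed to be positive in finite samples.
    se = [v ** 0.5 if v > 0 else float("nan") for v in var]
    return coefs, se


# Hypothetical toy panel: 3 firms observed in 2 periods.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.3]
firm = [1, 1, 2, 2, 3, 3]
time = [1, 2, 1, 2, 1, 2]
coefs, se = two_way_cluster_se(y, x, firm, time)
print(coefs, se)
```

When each firm-time pair contains a single observation, the intersection term reduces to the usual heteroskedasticity-robust (White) covariance. In practice a numerical library would replace the hand-written 2x2 algebra.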

 

 

 

 

 

 

 

 


 

General Code/Utilities

 


 

 

 

 

 


 

SAS and STATA Notes

 


  • The robust regression procedures in STATA (rreg) and SAS (proc robustreg) are almost identical. However, it appears that the former removes observations with Cook's D greater than 1, whereas the latter retains them.

     

  • In STATA you can use the command    xi: regress Y X i.Z     to estimate a model with a large number of fixed effects based on the values of Z.  For example, if SIC2 contains the two-digit industry code, then   xi: regress Y X i.SIC2   will create the industry dummy variables and estimate a model with industry fixed effects.

     

  • One can bootstrap any regression command in Stata using the code:  bootstrap "{insert command}" _b

     

  • One can compute one-way cluster-robust standard errors for STATA commands that do not have a built-in cluster option using the code:    bootstrap "{insert command}" _b, cluster( {insert clusterID} )

     

  • The common approach to estimating Newey-West corrected standard errors in SAS, using the proc model statement with a Bartlett kernel, (a) is sensitive to the ordering of the observations for time-series data and (b) does not produce Newey-West standard errors for panel data.  These problems arise because, unlike STATA's tsset command, the proc model statement does not allow the user to specify a time index, causing SAS to treat the panel as a single time series.
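The ordering sensitivity noted in the last item can be illustrated with a small Python sketch. It is not SAS or Stata code, and it covers only point (a): a Bartlett-kernel (Newey-West) covariance estimator that takes the row order of the data as the time order. It assumes a model with a constant and a single regressor; the function name, the toy data, and the shuffled ordering are all hypothetical.

```python
def newey_west_cov(y, x, lags):
    """Newey-West covariance for y = b0 + b1*x + e with Bartlett weights.

    Rows are assumed to be in time order; no time index is consulted,
    so reordering the rows changes which observations count as neighbours.
    """
    n = len(y)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]  # (X'X)^-1
    b0 = bread[0][0] * sy + bread[0][1] * sxy
    b1 = bread[1][0] * sy + bread[1][1] * sxy
    # Score vector for each observation: (1 * e_t, x_t * e_t).
    h = [(yi - b0 - b1 * xi, xi * (yi - b0 - b1 * xi)) for yi, xi in zip(y, x)]

    # Lag-0 term: the heteroskedasticity-robust (White) meat.
    meat = [[sum(s[i] * s[j] for s in h) for j in range(2)] for i in range(2)]
    # Lagged terms, down-weighted by the Bartlett kernel.
    for lag in range(1, lags + 1):
        w = 1.0 - lag / (lags + 1.0)
        for t in range(lag, n):
            for i in range(2):
                for j in range(2):
                    meat[i][j] += w * (h[t][i] * h[t - lag][j]
                                       + h[t - lag][i] * h[t][j])

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    return matmul(matmul(bread, meat), bread)


# Hypothetical series with persistent errors (four positive, then four negative).
x = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]
e = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
y = [2.0 + 0.5 * xi + ei for xi, ei in zip(x, e)]

# The same observations in a different row order.
order = [0, 4, 1, 5, 2, 6, 3, 7]
x_shuf = [x[i] for i in order]
y_shuf = [y[i] for i in order]

v_time = newey_west_cov(y, x, lags=1)
v_shuf = newey_west_cov(y_shuf, x_shuf, lags=1)
print(v_time[0][0], v_shuf[0][0])
```

The two printed intercept variances differ even though both calls see exactly the same observations, because the lagged cross-products depend on which rows are adjacent. With lags=0 the estimator reduces to the White covariance and the row order no longer matters. For panel data, one would additionally need to restrict the lagged products to observations from the same unit, which this sketch does not do.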