However, I have to admit that even after getting a PhD, nobody pointed this type of error out to me until I started working, and then it was statisticians who did. One analyst (another PhD economist) proposed estimating an equation with interaction terms but without including all of the interacted variables as main (or level) variables. The statistician on the project had to point out that this is not correct. I can now see why. For instance, let's say we want to estimate:
Y = a + bX1 + cX2 + dX1X2
1. From an ANOVA standpoint there is no reason to exclude X1 and X2 separately (one or both) and just include X1X2.
2. Leaving out one of the main effects (or level variables), for instance X2, is tantamount to imposing the restriction c = 0. There is no a priori reason to do this. Econometrics lets us test this restriction, and there really is no harm in keeping the variable in.
3. Leaving out one variable is similar to doing model selection by dropping insignificant variables, but in this case the authors do not test whether the variable is actually insignificant. In any case, even if a variable is not statistically significant, there is still no good reason to drop it in these types of analyses.
4. At the very least, analysts should consider including the variable as a main effect as part of a sensitivity analysis (even if they do not believe that the variable should be included as a main effect).
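To make point 2 concrete, here is a minimal sketch of how the restriction c = 0 could be tested rather than assumed. Everything in it is my own assumption for illustration: the data are simulated, the true coefficient values are made up, and the regression is done by hand with NumPy rather than with any particular econometrics package.

```python
import numpy as np

# Simulated data (an assumption for illustration, not anyone's real data).
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
# Made-up "true" model: Y = a + b*X1 + c*X2 + d*X1*X2 + noise
y = 1.0 + 0.5 * x1 + 1.0 * x2 + 0.3 * x1 * x2 + rng.normal(size=n)

# Full specification: constant, both main effects, and the interaction.
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Classical (homoskedastic) OLS standard errors.
resid = y - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

# t-statistic for the restriction c = 0 (coefficient on X2).
t_c = beta[2] / se[2]
print(f"c_hat = {beta[2]:.3f}, se = {se[2]:.3f}, t = {t_c:.1f}")
```

If the t-statistic is large in absolute value, the data reject c = 0 and dropping X2 imposes a restriction that is false. If it is small, keeping X2 costs very little.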
In their paper, Kotchen and Grant focus on the coefficient on the interaction (d in the equation above), which they use to support their claim that DST increases energy usage. My guess is that if they were to estimate the model correctly, the size of the coefficient d would fall. Right now their estimates of d are partially capturing the effects of the omitted variable. I suppose the other possibility is that including all the relevant variables as main effects would have resulted in perfect collinearity, although they don't indicate that this is the case.
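A small simulation shows how an omitted main effect can end up in the interaction coefficient. This is only a sketch under my own assumptions: X1 is a 0/1 dummy, X2 is continuous, and the coefficient values are invented. It says nothing about the magnitudes in Kotchen and Grant's actual data.

```python
import numpy as np

# Simulated data (hypothetical values chosen for illustration only).
rng = np.random.default_rng(1)
n = 5000
x1 = rng.integers(0, 2, size=n).astype(float)  # 0/1 dummy
x2 = rng.normal(size=n)                         # continuous variable
c_true, d_true = 1.0, 0.5
y = 2.0 + 0.3 * x1 + c_true * x2 + d_true * x1 * x2 + rng.normal(size=n)

def ols(columns, y):
    """Return OLS coefficients for a regression of y on the given columns."""
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

ones = np.ones(n)

# Correct specification: both main effects plus the interaction.
d_full = ols([ones, x1, x2, x1 * x2], y)[3]

# Misspecified: the X2 main effect is dropped (c = 0 is imposed).
d_restricted = ols([ones, x1, x1 * x2], y)[2]

print(f"d with X2 included: {d_full:.2f}")
print(f"d with X2 omitted:  {d_restricted:.2f}")
```

With a 0/1 dummy, the misspecified model has no way to fit the slope on X2 except through the interaction, so its estimate of d converges to c + d instead of d. When c has the same sign as d, the interaction coefficient is overstated, which is the direction of my guess above.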