I was a little irked to see this again, especially in the context of Felix Salmon's post "When Can You Trust Economics Papers?" My response would be: never.
This is a reprise of an earlier New Economist post and my own comments. From New Economist:
In my naivety I had assumed most social scientists understood how to model interaction effects. Then I read this post by Omar on orgtheory.net, and realised maybe I was wrong:
So we are agreed interaction models are awesome. However, as your stats 101 teacher told you, you have to be careful about two things: (1) never omit the main effects. Thus you don’t test hypothesis 1 using any of these specifications:
or Alanis forbid:
And (2) when interpreting b1 and b2 in the fully specified model, remember that those effects are conditional on the value of the other variables. b1 is now the effect of X on Y when Z=0 and b2 is now the effect of Z on Y when X=0. If your variables don’t have a meaningful zero point (like a racial attitudes scale), center them at their mean so that you can say “b2 is the effect of being Southern on voting Republican for those who have average levels of racial animus towards blacks.”
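As an aside to the quoted point, the conditional interpretation can be written out. This is a sketch that assumes the fully specified model has the usual form with coefficients b0 through b3. The quoted equations did not survive in this copy, so the exact notation is an assumption based on the b1, b2, X, Z and Y named in the text.

```latex
% Assumed fully specified model (notation inferred from the quoted text)
Y = b_0 + b_1 X + b_2 Z + b_3 XZ + \varepsilon

% Marginal effects are conditional on the other variable
\frac{\partial Y}{\partial X} = b_1 + b_3 Z ,
\qquad
\frac{\partial Y}{\partial Z} = b_2 + b_3 X

% Centering X at its mean, X_c = X - \bar{X}, gives
Y = (b_0 + b_1 \bar{X}) + b_1 X_c + (b_2 + b_3 \bar{X})\, Z + b_3 X_c Z + \varepsilon
```

After centering, the coefficient on Z is b2 + b3 times the mean of X, which is the effect of Z at the average value of X rather than at X=0. The interaction coefficient b3 is unchanged.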
Seems simple. Everybody knows this. Why am I even explaining this to you? Well, as noted by Brambor, Clark and Golder (2006) in a recent article in Political Analysis, a survey of 156 articles published in the major Political Science journals shows that only 10% of researchers specified their interaction models correctly. A large chunk of them outright omitted main effects, which can lead to incorrect significance tests of the interaction term. In some of these articles the entire contribution was riding on the interaction term. So things are not so simple. Consider the horror:
In an award-winning article in the American Political Science Review, Boix (1999) examines the factors that determine electoral system choice in advanced democracies. He makes two main conclusions. First, ethnic or religious fragmentation encourages the adoption of proportional representation in small and medium-sized countries (621). He draws this conclusion based on a model that includes an interaction term between ethnoreligious fragmentation and country size. However, he does not include either of the constitutive terms. When these terms are included, there is no longer any evidence that ethno-religious fragmentation ever affects the adoption of proportional representation (italics added).
You should read the article to see other horror stories. The lesson: if your dissertation/paper is riding on an interaction effect, don’t be a fool. Estimate a fully specified model.
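To see how omitting the constitutive terms can manufacture an interaction effect, here is a minimal simulation sketch. It is not taken from any of the papers discussed. The data-generating process, the sample size, and the small `ols` helper are all assumptions chosen for illustration. The true model has no interaction at all: Y depends only on X.

```python
import random

def ols(columns, y):
    """Least-squares coefficients via the normal equations.

    An intercept column is added first, so the result is
    [b0, coefficient of columns[0], coefficient of columns[1], ...].
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Build the augmented system [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]        # continuous regressor
z = [float(rng.random() < 0.5) for _ in range(n)]  # binary "treatment"
xz = [a * b for a, b in zip(x, z)]

# Assumed truth: Y = 1 + 2X + noise. Z and X*Z have zero true effect.
y = [1.0 + 2.0 * a + rng.gauss(0.0, 1.0) for a in x]

full = ols([x, z, xz], y)  # [b0, b1, b2, b3], fully specified
reduced = ols([xz], y)     # [b0, b3], both constitutive terms omitted

print("interaction coefficient, full model:    %.3f" % full[3])
print("interaction coefficient, reduced model: %.3f" % reduced[1])
```

In the fully specified model the interaction coefficient should come out near zero, as it must. In the reduced model the X*Z term is the only regressor correlated with X, so it absorbs the main effect of X and the interaction coefficient should come out large. That is the same mechanism the quoted passage describes for the Boix (1999) result.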
It looks as though the Daylight Saving Time paper by Grant and Kotchen did not do this (or, if they did, it is not at all obvious from the paper). See their Equation 2 and Tables 4 and 5. The treatment variable needs to be included as a main effect, and it was not. Their conclusions rest entirely on the interaction of the treatment with year.