Tuesday, April 12, 2011

Multiple comparisons

Wikipedia describes this problem as follows:

In statistics, the multiple comparisons or multiple testing problem occurs when one considers a set of statistical inferences simultaneously.[1] Errors in inference, including confidence intervals that fail to include their corresponding population parameters or hypothesis tests that incorrectly reject the null hypothesis are more likely to occur when one considers the set as a whole. Several statistical techniques have been developed to prevent this from happening, allowing significance levels for single and multiple comparisons to be directly compared. These techniques generally require a stronger level of evidence to be observed in order for an individual comparison to be deemed "significant", so as to compensate for the number of inferences being made.

Some examples (again from Wikipedia):
  • Suppose the treatment is a new way of teaching writing to students, and the control is the standard way of teaching writing. Students in the two groups can be compared in terms of grammar, spelling, organization, content, and so on. As more attributes are compared, it becomes more likely that the treatment and control groups will appear to differ on at least one attribute.
  • Suppose we consider the efficacy of a drug in terms of the reduction of any one of a number of disease symptoms. As more symptoms are considered, it becomes more likely that the drug will appear to be an improvement over existing drugs in terms of at least one symptom.
  • Suppose we consider the safety of a drug in terms of the occurrences of different types of side effects. As more types of side effects are considered, it becomes more likely that the new drug will appear to be less safe than existing drugs in terms of at least one side effect.
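The "at least one" effect in the examples above is easy to demonstrate by simulation. Under a true null hypothesis, a valid p-value is uniformly distributed on [0, 1], so the chance that at least one of m independent tests falls below 0.05 is 1 - 0.95^m, which grows quickly with m. A minimal sketch (the function name and simulation settings are my own, for illustration):

```python
import random

def familywise_error_rate(n_tests, alpha=0.05, n_sims=20000, seed=1):
    """Estimate the probability of at least one false positive when
    n_tests true-null hypotheses are each tested at level alpha.
    Under a true null, a valid p-value is uniform on [0, 1]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # One simulated study: n_tests independent null p-values.
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_sims

for m in (1, 5, 20):
    print(m, round(familywise_error_rate(m), 3), round(1 - 0.95**m, 3))
```

With one test the error rate stays near 0.05, but with 20 comparisons the simulated and theoretical rates both climb past 0.6: more attributes compared means more chances for a spurious "significant" difference.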

The question I have is the following: What if I refuse to test for some of the additional attributes, symptoms, or hypotheses? Does this make the multiple comparisons problem go away? If I ignore the fact that there are possibly other hypotheses out there that I do not specifically test for, does that make my results more valid, in the sense that I do not have to adjust my confidence intervals or level of significance? What if all studies did this: focus only on the hypothesis of interest and ignore all other testable hypotheses within their model or experiment?
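For comparison, the "stronger level of evidence" that the quoted passage mentions can be made concrete with the simplest such adjustment, the Bonferroni correction: test each of m hypotheses at level alpha/m instead of alpha. A minimal sketch (again with illustrative names and settings of my own), showing that the correction holds the family-wise error rate near 0.05 regardless of how many comparisons are actually run:

```python
import random

def bonferroni_fwer(n_tests, alpha=0.05, n_sims=20000, seed=2):
    """Estimate the family-wise error rate when each of n_tests
    true-null p-values is compared against alpha / n_tests
    (the Bonferroni-corrected threshold)."""
    rng = random.Random(seed)
    threshold = alpha / n_tests
    hits = 0
    for _ in range(n_sims):
        # A study counts as a false positive if any corrected test fires.
        if any(rng.random() < threshold for _ in range(n_tests)):
            hits += 1
    return hits / n_sims

for m in (1, 5, 20):
    print(m, round(bonferroni_fwer(m), 3))
```

All three estimates stay close to 0.05. Note that the correction depends on m, the number of comparisons actually performed, which is exactly why the question of which hypotheses one commits to testing (and when that choice is made) matters.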
