Dealing with Confounders in Omics Analysis

Trends Biotechnol. 2018 May;36(5):488-498. doi: 10.1016/j.tibtech.2018.01.013. Epub 2018 Feb 20.

Abstract

The Anna Karenina effect is a manifestation of the theory-practice gap that exists when theoretical statistics are applied to real-world data. In the analysis of biological data for differential features such as genes or proteins, it arises when the null hypothesis is rejected for extraneous reasons (confounders) rather than because the alternative hypothesis is relevant to the disease phenotype. The mechanics of applying statistical tests must therefore address and resolve confounders; simply manipulating the P-value is inadequate. We discuss three mechanistic elements (hypothesis statement construction, null distribution appropriateness, and test-statistic construction) and suggest how they can be designed to foil the Anna Karenina effect and select phenotypically relevant biological features.
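
As an illustration of the second element (null distribution appropriateness), the following is a minimal sketch and not the authors' method. It assumes a deterministic toy dataset in which a single feature has no disease effect at all but is shifted by a hypothetical batch variable that is unevenly distributed across cases and controls. The function names, the batch layout, and the choice of a mean-difference statistic are all assumptions made for the example. A permutation test that shuffles labels freely rejects the null because of the batch shift alone, whereas shuffling labels only within each batch builds a null distribution that already contains the confounder.

```python
import numpy as np

def mean_diff(x, labels):
    """Test statistic: mean of cases (1) minus mean of controls (0)."""
    return x[labels == 1].mean() - x[labels == 0].mean()

def permutation_p(x, labels, strata=None, n_perm=2000, seed=0):
    """Permutation P-value for |mean_diff|.

    If `strata` is given, labels are shuffled only within each stratum,
    so the null distribution keeps the case/control imbalance per stratum.
    """
    rng = np.random.default_rng(seed)
    if strata is None:
        strata = np.zeros(len(x), dtype=int)  # one stratum: free shuffling
    observed = abs(mean_diff(x, labels))
    hits = 0
    for _ in range(n_perm):
        shuffled = labels.copy()
        for s in np.unique(strata):
            idx = np.where(strata == s)[0]
            shuffled[idx] = rng.permutation(labels[idx])
        if abs(mean_diff(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data (assumed): batch 0 holds 32 controls and 8 cases,
# batch 1 holds 8 controls and 32 cases.
pattern = np.linspace(-1.5, 1.5, 8)
batch = np.repeat([0, 1], 40)
labels = np.concatenate([np.repeat([0, 1], [32, 8]),
                         np.repeat([0, 1], [8, 32])])
# Within each batch, cases and controls share the same value pattern,
# so there is no disease effect; only the batch shifts the feature.
noise = np.concatenate([np.tile(pattern, 4), pattern,
                        pattern, np.tile(pattern, 4)])
x = noise + 3.0 * batch

p_naive = permutation_p(x, labels)                # ignores the confounder
p_strat = permutation_p(x, labels, strata=batch)  # null respects the batch
print(f"naive P = {p_naive:.4f}, stratified P = {p_strat:.4f}")
```

In this sketch the naive test yields a very small P-value even though the feature carries no phenotype information, while the stratified test does not reject. Adjusting the significance threshold would not separate the two cases; changing how the null distribution is generated does.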

Keywords: Statistics; biomarker; feature selection; generalizability; reproducibility.

Publication types

  • Review

MeSH terms

  • Biomarkers / analysis*
  • Biostatistics / methods*
  • Electronic Data Processing / methods*

Substances

  • Biomarkers