Replicability in cancer omics data analysis: measures and empirical explorations

Brief Bioinform. 2022 Sep 20;23(5):bbac304. doi: 10.1093/bib/bbac304.

Abstract

In biomedical research, the replicability of findings across studies is highly desired. In this study, we focus on cancer omics data, for which the examination of replicability has mostly concerned the important omics variables identified in different studies. In the published literature, although there has been extensive attention and ad hoc discussion, there is insufficient quantitative research into replicability measures and their properties. The goal of this study is to fill this important knowledge gap. In particular, we consider three sensible replicability measures, for which we examine distributional properties and develop a way of making inference. Applying them to three datasets from The Cancer Genome Atlas (TCGA) reveals generally low replicability and significant across-data variations. To better understand these findings, we resort to simulation, which confirms the validity of the findings with the TCGA data and further informs the dependence of replicability on signal level (or equivalently sample size). Overall, this study can advance our understanding of replicability for cancer omics and other studies that have identification as a key goal.
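
The abstract does not name the three replicability measures, so the following is only an illustrative sketch and not the authors' method. It assumes a replicability measure can be taken as the Jaccard overlap between the sets of variables identified in two studies, and that inference can be approximated by comparing the observed overlap with overlaps of randomly drawn sets of the same sizes. The function names `jaccard` and `null_pvalue`, the random-selection null model, and all parameter values are hypothetical.

```python
import random


def jaccard(a, b):
    """Jaccard overlap between two collections of identified variables."""
    a, b = set(a), set(b)
    union = a | b
    # Convention (an assumption): two empty selections count as identical.
    return len(a & b) / len(union) if union else 1.0


def null_pvalue(a, b, n_variables, n_draws=2000, seed=0):
    """Monte Carlo p-value for the observed overlap.

    Assumed null model: each study picks its variables uniformly at random
    from n_variables candidates, keeping the observed selection sizes.
    """
    rng = random.Random(seed)
    observed = jaccard(a, b)
    universe = range(n_variables)
    size_a, size_b = len(set(a)), len(set(b))
    hits = 0
    for _ in range(n_draws):
        random_a = rng.sample(universe, size_a)
        random_b = rng.sample(universe, size_b)
        if jaccard(random_a, random_b) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_draws + 1)


if __name__ == "__main__":
    study_1 = {1, 2, 3, 4, 5}   # hypothetical indices of identified variables
    study_2 = {4, 5, 6, 7, 8}
    print(jaccard(study_1, study_2))
    print(null_pvalue(study_1, study_2, n_variables=1000))
```

Under this assumed null, a small p-value indicates more overlap than chance selection would produce. It does not account for correlation among omics variables, which a real analysis would need to address.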

Keywords: cancer omics data analysis; quantitative properties; replicability measure.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Biomedical Research*
  • Humans
  • Neoplasms* / genetics
  • Sample Size