Selection Probability for Rare Variant Association Studies

J Comput Biol. 2017 May;24(5):400-411. doi: 10.1089/cmb.2016.0222. Epub 2017 Mar 10.

Abstract

In human genome research, genetic association studies of rare variants have become widespread since the advent of high-throughput DNA sequencing platforms. However, detecting outcome-related rare variants remains statistically challenging because the observed genetic mutations are extremely rare. Recently, a power set-based statistical selection procedure was proposed to locate both risk and protective rare variants within outcome-related genes or genetic regions. Although it can select rare variants individually, the procedure cannot measure the certainty of the selected rare variants. In this article, we propose a selection probability for individual rare variants, in which the selection frequencies of rare variants are computed by bootstrap resampling. The selection probability can therefore quantify the certainty of both selected and unselected rare variants. We also introduce a new selection approach that applies a threshold to the selection probability and compare it with some existing selection procedures through extensive simulation studies and real sequencing data analysis. The results demonstrate that the proposed approach outperforms the existing methods in terms of selection power.
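
The bootstrap idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy data, the 0.5 threshold, and in particular toy_selector (a simple carrier-rate comparison standing in for the power set-based procedure, which the abstract does not specify) are all assumptions made for illustration. The sketch only shows how a selection frequency over bootstrap resamples could be turned into a per-variant selection probability and then thresholded.

```python
import random


def toy_selector(genotypes, outcomes, min_diff=0.2):
    """Hypothetical stand-in selector, NOT the power set-based procedure.

    Selects variants whose carrier rate differs between cases and
    controls by at least min_diff, in either direction (risk or protective).
    """
    cases = [g for g, y in zip(genotypes, outcomes) if y == 1]
    controls = [g for g, y in zip(genotypes, outcomes) if y == 0]
    if not cases or not controls:
        return set()  # a resample without both groups selects nothing
    selected = set()
    for j in range(len(genotypes[0])):
        case_rate = sum(g[j] for g in cases) / len(cases)
        control_rate = sum(g[j] for g in controls) / len(controls)
        if abs(case_rate - control_rate) >= min_diff:
            selected.add(j)
    return selected


def selection_probability(genotypes, outcomes, selector, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each variant is selected."""
    rng = random.Random(seed)
    n = len(genotypes)
    counts = [0] * len(genotypes[0])
    for _ in range(n_boot):
        # Resample subjects with replacement, keeping genotype and outcome paired.
        idx = [rng.randrange(n) for _ in range(n)]
        boot_g = [genotypes[i] for i in idx]
        boot_y = [outcomes[i] for i in idx]
        for j in selector(boot_g, boot_y):
            counts[j] += 1
    return [c / n_boot for c in counts]


def select_by_threshold(probs, threshold=0.5):
    """Keep variants whose selection probability reaches the threshold."""
    return [j for j, p in enumerate(probs) if p >= threshold]


# Toy data (assumed): 20 cases followed by 20 controls, three variants.
# Variant 0 is carried by 10 cases only, variant 1 by one case and one
# control, and variant 2 by 8 controls only.
outcomes = [1] * 20 + [0] * 20
genotypes = []
for i in range(40):
    is_case = i < 20
    v0 = 1 if is_case and i < 10 else 0
    v1 = 1 if i in (0, 20) else 0
    v2 = 1 if (not is_case) and i >= 32 else 0
    genotypes.append((v0, v1, v2))

probs = selection_probability(genotypes, outcomes, toy_selector)
chosen = select_by_threshold(probs, threshold=0.5)
print(probs, chosen)
```

Unlike a single run of the selector, which returns only a yes/no decision per variant, the resampled frequencies give every variant a value between 0 and 1, so unselected variants also receive a measure of certainty.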

Keywords: genetic association study; rare variant; selection probability; sequencing data.

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Genetic Association Studies / methods*
  • Genetic Variation*
  • Genome, Human
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Models, Genetic
  • Models, Statistical
  • Mutation Rate
  • Sequence Analysis, DNA