SimPEL: Simulation-based power estimation for sequencing studies of low-prevalence conditions

Genet Epidemiol. 2018 Jul;42(5):480-487. doi: 10.1002/gepi.22129. Epub 2018 May 22.

Abstract

Power estimation is important for optimizing genotype-phenotype association study designs. However, existing frameworks are designed for common disorders and are thus ill-suited to the inherent challenges of studying low-prevalence conditions such as rare diseases and infrequent adverse drug reactions. These challenges include small sample sizes and the need to leverage genetic annotation resources in association analyses to rank potential causal genes. We present SimPEL, a simulation-based program that provides power estimates for the design of low-prevalence condition studies. SimPEL integrates gene annotation resources into association analyses. Customizable parameters, including the penetrance of the putative causal allele and the pathogenicity scoring system employed, allow SimPEL to realistically model a wide range of study designs. To demonstrate the effects of various parameters on power, we estimated the power of several simulated designs using SimPEL and captured power trends that agree with observations from the current literature on low-prevalence condition studies. SimPEL provides researchers studying low-prevalence conditions with an intuitive and highly flexible avenue for statistical power estimation. The platform-independent "batteries included" executable and default input files are available at https://github.com/precisionomics/SimPEL.
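To make the idea of simulation-based power estimation concrete, the following is a minimal Python sketch. It is not SimPEL's algorithm or interface; every function and parameter name here is hypothetical. It assumes a toy model in which each case carries a variant in the single causal gene with probability `carrier_fraction` (a crude stand-in for penetrance-related parameters), each other gene carries a background variant with probability `background_rate`, variants receive a pathogenicity score drawn uniformly from an assumed range, and genes are ranked by their summed score. Power is then the fraction of simulated studies in which the causal gene ranks within the top `top_k`.

```python
import random


def simulate_causal_rank(rng, n_cases, n_genes, carrier_fraction, background_rate):
    """Simulate one study and return the rank of the causal gene (1 = best)."""
    scores = [0.0] * n_genes  # gene 0 is the (hypothetical) causal gene
    for _ in range(n_cases):
        # Causal variants get high pathogenicity scores (assumed range 0.5-1.0).
        if rng.random() < carrier_fraction:
            scores[0] += rng.uniform(0.5, 1.0)
        # Background variants in other genes get lower scores (assumed 0.0-0.5).
        for gene in range(1, n_genes):
            if rng.random() < background_rate:
                scores[gene] += rng.uniform(0.0, 0.5)
    causal_score = scores[0]
    # Ties count against the causal gene, so the estimate is conservative.
    return 1 + sum(1 for s in scores[1:] if s >= causal_score)


def estimate_power(n_cases, carrier_fraction, background_rate,
                   n_genes=100, top_k=1, n_sims=200, seed=0):
    """Fraction of simulated studies in which the causal gene ranks in the top_k."""
    rng = random.Random(seed)  # fixed seed for reproducible estimates
    hits = 0
    for _ in range(n_sims):
        rank = simulate_causal_rank(rng, n_cases, n_genes,
                                    carrier_fraction, background_rate)
        if rank <= top_k:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # Compare estimated power across a few small sample sizes.
    for n in (5, 10, 20):
        power = estimate_power(n_cases=n, carrier_fraction=0.3, background_rate=0.02)
        print(f"n_cases={n:2d}  estimated power={power:.2f}")
```

In a sketch like this, varying `n_cases`, `carrier_fraction`, or the score ranges shows how sample size, effect strength, and the discriminating ability of the scoring system each shift the estimated power, which is the kind of trend the abstract describes exploring.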

Keywords: adverse drug reactions; association analyses; genetic variant annotation; genome-wide sequencing; power estimation; rare disease.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Computer Simulation*
  • Genetic Association Studies
  • Genome-Wide Association Study
  • Humans
  • Models, Genetic*
  • Penetrance
  • Prevalence
  • Sample Size
  • Sequence Analysis, DNA*