Marginal likelihood estimation of negative binomial parameters with applications to RNA-seq data

Biostatistics. 2017 Oct 1;18(4):637-650. doi: 10.1093/biostatistics/kxx006.

Abstract

RNA-Seq data characteristically exhibit large variances, which need to be appropriately accounted for in any proposed model. We first explore the effects of this variability on the maximum likelihood estimator (MLE) of the dispersion parameter of the negative binomial distribution, and propose instead to use an estimator obtained by maximizing the marginal likelihood in a conjugate Bayesian framework. We show, via simulation studies, that the marginal MLE can better control this variation and produce a more stable and reliable estimator. We then formulate a conjugate Bayesian hierarchical model, and use this new estimator to propose a Bayesian hypothesis test to detect differentially expressed genes in RNA-Seq data. We use numerical studies to show that our much simpler approach is competitive with other negative binomial-based procedures, and we use a real data set to illustrate the implementation and flexibility of the procedure.
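
To make the idea of a marginal MLE concrete, the following is a minimal sketch and not the authors' implementation. It assumes the counts are negative binomial with size parameter r and success probability p, that p has a conjugate Beta(a, b) prior, and that p is integrated out analytically before the result is maximized over r. The hyperparameters a and b, the search grid, the function names, and the toy counts are all assumptions made for illustration. The abstract does not say whether the paper treats dispersion as r or as 1/r, so r is used here only as a stand-in.

```python
import math


def log_beta(x, y):
    """Log of the Beta function B(x, y), computed via log-gamma."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def marginal_loglik(r, counts, a=1.0, b=1.0):
    """Log marginal likelihood of r after integrating out p ~ Beta(a, b).

    Assumed model (illustrative): y_i | r, p ~ NB(r, p) with pmf
    Gamma(y + r) / (Gamma(r) * y!) * p**r * (1 - p)**y, i = 1..n.
    Integrating p out gives
    prod_i [Gamma(y_i + r) / (Gamma(r) * y_i!)] * B(a + n*r, b + sum(y)) / B(a, b).
    """
    n = len(counts)
    total = sum(counts)
    log_coef = sum(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1) for y in counts
    )
    return log_coef + log_beta(a + n * r, b + total) - log_beta(a, b)


def dispersion_grid(lo_exp=-2.0, hi_exp=3.0, steps=500):
    """Log-spaced candidate values for r (the range is a hypothetical choice)."""
    return [10.0 ** (lo_exp + (hi_exp - lo_exp) * i / steps) for i in range(steps + 1)]


def marginal_mle_dispersion(counts, a=1.0, b=1.0, grid=None):
    """Return the grid value of r that maximizes the marginal log-likelihood."""
    if grid is None:
        grid = dispersion_grid()
    return max(grid, key=lambda r: marginal_loglik(r, counts, a, b))


if __name__ == "__main__":
    toy_counts = [3, 7, 0, 12, 5]  # made-up counts for one gene, not real data
    r_hat = marginal_mle_dispersion(toy_counts)
    print("marginal MLE of r on the grid:", r_hat)
```

A grid search is used instead of a numerical optimizer so that the sketch needs only the standard library and does not rely on the objective being unimodal. In practice a one-dimensional optimizer over log r would be a natural replacement.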

Keywords: Bayesian methods; DEG; Hierarchical models; Hypothesis testing; Maximum likelihood estimation; Model selection; Negative binomial; RNA-Seq analysis.

MeSH terms

  • Binomial Distribution*
  • Humans
  • Likelihood Functions*
  • Sequence Analysis, RNA / methods*