A Bayesian longitudinal trend analysis of count data with Gaussian processes

Biom J. 2022 Jan;64(1):74-90. doi: 10.1002/bimj.202000298. Epub 2021 Sep 1.

Abstract

We consider the comparison of two groups of subjects that are measured repeatedly over time. Our specific focus is on highly variable count data that have a nonnegligible frequency of zeros and time trends that are difficult to characterize. These challenges often arise when analyzing bacteria or gene expression data sets. Traditional longitudinal data analysis methods, including generalized estimating equations, can struggle with the features present in these types of data sets. We propose a Bayesian methodology that addresses these challenges. A key feature of the methodology is the use of Gaussian processes to flexibly model the time trends. Inference procedures based on both sharp and interval null hypotheses are discussed, including for the important hypotheses concerning group differences at individual time points. The proposed methodology is illustrated with next-generation sequencing (NGS) data sets corresponding to two different experimental conditions. In particular, the method is applied to a case study of bacteria counts from mice with chronic and nonchronic wounds, with the aim of identifying potential wound-healing probiotics. The methodology can be applied to similar NGS data sets comparing two groups of subjects.
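
As a rough illustration of two ideas named in the abstract, the sketch below builds a Gaussian process covariance over time points and evaluates an interval null hypothesis for the group difference at each time point. It is not the authors' model: the abstract does not state the kernel, the count likelihood, or the interval width, so the squared-exponential kernel, the function names `sq_exp_kernel` and `interval_null_probability`, the time grid, and the threshold `eps` are all assumptions made for illustration. Prior draws stand in here for the MCMC output that a fitted model would supply.

```python
import numpy as np


def sq_exp_kernel(times, variance=1.0, length_scale=1.0):
    """Squared-exponential covariance matrix over a vector of time points.

    Assumed kernel; the abstract does not say which covariance is used.
    """
    t = np.asarray(times, dtype=float)
    diff = t[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def interval_null_probability(diff_draws, eps):
    """Share of draws with |difference| <= eps, computed per time point.

    diff_draws has shape (n_draws, n_times). With posterior draws this
    estimates the posterior probability of the interval null [-eps, eps].
    """
    d = np.asarray(diff_draws, dtype=float)
    return np.mean(np.abs(d) <= eps, axis=0)


# Illustrative only: GP prior draws stand in for MCMC output.
rng = np.random.default_rng(0)
times = np.array([0.0, 1.0, 2.0, 3.0, 4.0])  # hypothetical sampling times
K = sq_exp_kernel(times, variance=1.0, length_scale=1.5)
jitter = 1e-8 * np.eye(len(times))  # small diagonal term for numerical stability

mean = np.zeros(len(times))
trend_a = rng.multivariate_normal(mean, K + jitter, size=2000)  # group A latent trend
trend_b = rng.multivariate_normal(mean, K + jitter, size=2000)  # group B latent trend

# One probability per time point that the group difference lies within eps of zero.
prob_null = interval_null_probability(trend_a - trend_b, eps=0.5)
```

A sharp null (difference exactly zero) has probability zero under a continuous posterior, which is one reason an interval such as `[-eps, eps]` can be a more practical target for time-point comparisons.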

Keywords: Markov chain Monte Carlo; NGS data; interval null hypothesis; longitudinal.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Bayes Theorem
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Markov Chains
  • Mice
  • Monte Carlo Method
  • Normal Distribution