A Bayesian hierarchical model for signal extraction from protein microarrays

Stat Med. 2023 Apr 30;42(9):1445-1460. doi: 10.1002/sim.9680. Epub 2023 Mar 5.

Abstract

Protein microarrays are a promising technology that measures protein levels in serum or plasma samples. Because of their high technical variability and the high variation in protein levels across serum samples in any population, directly answering biological questions of interest using protein microarray measurements is challenging. Analyzing preprocessed data and within-sample ranks of protein levels can mitigate the impact of between-sample variation. As with any analysis, ranks are sensitive to preprocessing, but loss-function-based ranks that accommodate major structural relations and components of uncertainty are very effective. Bayesian modeling with full posterior distributions for quantities of interest produces the most effective ranks. Such Bayesian models have been developed for other assays, for example, DNA microarrays, but the modeling assumptions for these assays are not appropriate for protein microarrays. Consequently, we develop and evaluate a Bayesian model to extract the full posterior distribution of normalized protein levels and associated ranks for protein microarrays, and show that it fits well the data from two studies that use protein microarrays produced by different manufacturing processes. We validate the model via simulation and demonstrate the downstream impact of using estimates from this model to obtain optimal ranks.
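As an illustration of what loss-function-based ranks can look like in practice, the sketch below computes posterior mean ranks from posterior draws of protein levels within a single sample. It assumes squared-error loss on the ranks, under which the posterior mean rank is the optimal estimate; the abstract does not state which loss the authors use, so this choice is an assumption. The function names and the toy draws are hypothetical, not taken from the paper.

```python
import random


def within_sample_ranks(values):
    """Rank values within one sample (1 = smallest); ties are broken by index."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def posterior_mean_ranks(draws):
    """Average the within-sample ranks over posterior draws.

    draws: list of posterior draws, each a list of protein levels for
    the same proteins in the same sample. Under squared-error loss on
    the ranks (an assumption here), this average is the optimal estimate.
    """
    n_proteins = len(draws[0])
    totals = [0.0] * n_proteins
    for draw in draws:
        for idx, rank in enumerate(within_sample_ranks(draw)):
            totals[idx] += rank
    return [total / len(draws) for total in totals]


def integer_ranks(mean_ranks):
    """Convert posterior mean ranks to integer ranks by ranking them again."""
    return within_sample_ranks(mean_ranks)


if __name__ == "__main__":
    # Hypothetical posterior: three proteins with different uncertainty.
    rng = random.Random(0)
    means = [0.0, 0.2, 3.0]
    sds = [1.0, 0.1, 0.5]
    draws = [[rng.gauss(m, s) for m, s in zip(means, sds)] for _ in range(2000)]
    mean_ranks = posterior_mean_ranks(draws)
    print(mean_ranks, integer_ranks(mean_ranks))
```

Ranking each posterior draw and then averaging lets the uncertainty in the protein levels carry through to the ranks, whereas ranking point estimates would ignore it.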

Keywords: Bayesian models; bioinformatics; optimal ranks; preprocessing; protein microarrays.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bayes Theorem
  • Computer Simulation
  • Humans
  • Oligonucleotide Array Sequence Analysis
  • Protein Array Analysis*