MixTwice: large-scale hypothesis testing for peptide arrays by variance mixing

Zihao Zheng; Aisha M Mergaert; Irene M Ong; Miriam A Shelef; Michael A Newton

doi:10.1093/bioinformatics/btab162

Bioinformatics. 2021 Sep 9;37(17):2637-2643. doi: 10.1093/bioinformatics/btab162.

Authors

Zihao Zheng^{1,2}, Aisha M Mergaert^{2,3}, Irene M Ong^{4,5,6}, Miriam A Shelef^{2,7}, Michael A Newton^{1,4,6}

Affiliations

¹ Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA.
² Department of Medicine, University of Wisconsin-Madison, Madison, WI 53705, USA.
³ Department of Pathology and Laboratory Medicine, University of Wisconsin-Madison, Madison, WI 53705-2281, USA.
⁴ Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53726, USA.
⁵ Department of Obstetrics and Gynecology, University of Wisconsin-Madison, Madison, WI 53792, USA.
⁶ University of Wisconsin Carbone Comprehensive Cancer Center, University of Wisconsin-Madison, Madison, WI 53792, USA.
⁷ William S. Middleton Memorial Veterans Hospital, Madison, WI 53705, USA.

Abstract

Summary: Peptide microarrays have emerged as a powerful technology in immunoproteomics as they provide a tool to measure the abundance of different antibodies in patient serum samples. The high dimensionality and small sample size of many experiments challenge conventional statistical approaches, including those aiming to control the false discovery rate (FDR). Motivated by limitations in reproducibility and power of current methods, we advance an empirical Bayesian tool that computes local FDR statistics and local false sign rate statistics when provided with data on estimated effects and estimated standard errors from all the measured peptides. As the name suggests, the MixTwice tool involves the estimation of two mixing distributions, one on underlying effects and one on underlying variance parameters. Constrained optimization techniques provide for model fitting of mixing distributions under weak shape constraints (unimodality of the effect distribution). Numerical experiments show that MixTwice can accurately estimate generative parameters and powerfully identify non-null peptides. In a peptide array study of rheumatoid arthritis, MixTwice recovers meaningful peptide markers in one case where the signal is weak, and has strong reproducibility properties in one case where the signal is strong.

Availability and implementation: MixTwice is available as an R software package at https://cran.r-project.org/web/packages/MixTwice/.

Supplementary information: Supplementary data are available at Bioinformatics online.
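
Illustration: the local FDR and local false sign rate statistics named in the summary can be made concrete with a deliberately simplified sketch. It is not the MixTwice algorithm: it assumes a normal likelihood, treats the standard error as known (so there is no variance mixing), and takes the mixing weights on a discrete effect grid as given rather than estimating them under a unimodality constraint. The function name `local_rates`, the grid and the weights are hypothetical choices made only for this example.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def local_rates(estimate, std_error, grid, weights):
    """Return (lfdr, lfsr) for one unit (e.g. one peptide).

    Assumptions of this sketch (not taken from the paper):
      - estimate ~ Normal(effect, std_error^2), std_error known
      - the effect takes values on `grid` with prior `weights`
    """
    # Prior weight times likelihood for each candidate effect value.
    joint = [w * normal_pdf(estimate, theta, std_error)
             for theta, w in zip(grid, weights)]
    total = sum(joint)
    posterior = [j / total for j in joint]

    # Posterior mass at zero, below zero, and above zero.
    p_zero = sum(p for theta, p in zip(grid, posterior) if theta == 0)
    p_neg = sum(p for theta, p in zip(grid, posterior) if theta < 0)
    p_pos = sum(p for theta, p in zip(grid, posterior) if theta > 0)

    lfdr = p_zero                                 # P(effect = 0 | data)
    lfsr = min(p_zero + p_neg, p_zero + p_pos)    # chance the called sign is wrong
    return lfdr, lfsr


# Hypothetical mixing distribution, unimodal about zero.
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
weights = [0.05, 0.10, 0.70, 0.10, 0.05]

lfdr_small, lfsr_small = local_rates(0.2, 1.0, grid, weights)
lfdr_large, lfsr_large = local_rates(4.0, 1.0, grid, weights)
```

Under these assumptions, an estimate far from zero relative to its standard error receives a smaller local FDR than one close to zero, and the local false sign rate is never smaller than the local FDR, because it also counts posterior mass on the opposite sign.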
