RSim: A reference-based normalization method via rank similarity

PLoS Comput Biol. 2023 Sep 1;19(9):e1011447. doi: 10.1371/journal.pcbi.1011447. eCollection 2023 Sep.

Abstract

Microbiome sequencing data normalization is crucial for eliminating technical bias and ensuring accurate downstream analysis. However, this process can be challenging due to the high frequency of zero counts in microbiome data. We propose a novel reference-based normalization method called normalization via rank similarity (RSim) that corrects sample-specific biases, even in the presence of many zero counts. Unlike other normalization methods, RSim does not require additional assumptions or treatments for the high prevalence of zero counts. This makes it robust and minimizes potential bias resulting from procedures that address zero counts, such as pseudo-counts. Our numerical experiments demonstrate that RSim reduces false discoveries, improves detection power, and reveals true biological signals in downstream tasks such as PCoA plotting, association analysis, and differential abundance analysis.
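
As a rough illustration of the general idea behind reference-based scaling, the sketch below divides each sample by its total count over a fixed set of reference taxa. It is not the RSim procedure itself: the abstract does not describe how rank similarity is used to select the reference set, so that step is omitted and the reference set is simply given. The function name `reference_scale`, the toy counts, and the choice of reference taxa are all hypothetical. The example shows only why a scaling factor built from a sum over reference taxa needs no pseudo-count when individual counts are zero.

```python
def reference_scale(counts, reference_taxa):
    """Scale each sample by its total count over a set of reference taxa.

    counts: list of samples, each a list of non-negative taxon counts.
    reference_taxa: column indices assumed (for illustration only) to be
    non-differentially abundant across samples.
    """
    normalized = []
    for sample in counts:
        # The size factor is a sum, so zeros in individual taxa are harmless
        # as long as the reference set as a whole has a positive count.
        size_factor = sum(sample[j] for j in reference_taxa)
        if size_factor == 0:
            raise ValueError("reference taxa have zero total count in a sample")
        normalized.append([c / size_factor for c in sample])
    return normalized


# Toy data: sample 2 is sample 1 at twice the sequencing depth,
# except for the last taxon, which truly differs between the samples.
counts = [
    [10, 0, 5, 4],
    [20, 0, 10, 40],
]
normalized = reference_scale(counts, reference_taxa=[0, 1, 2])
```

After scaling, the three reference taxa agree across the two samples, including the taxon with zero counts, while the last taxon still differs. Scaling by total library size instead would let the last taxon distort the other three.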

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computational Biology* / methods
  • Microbiota*

Grants and funding

BY and SW are funded by the National Science Foundation (DMS-2113458). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.