Identifying differential genes over conditions provides insights into the mechanisms of biological processes and disease progression. Here we present an approach, the Kullback-Leibler divergence-based differential distribution (klDD), which provides a flexible framework for quantifying changes in higher-order statistical information of genes including mean and variance/covariation. The method can well detect subtle differences in gene expression distributions in contrast to mean or variance shifts of the existing methods. In addition to effectively identifying informational genes in terms of differential distribution, klDD can be directly applied to cancer subtyping, single-cell clustering and disease early-warning detection, which were all validated by various benchmark datasets.
Keywords: cancer subtyping; differential distribution genes; disease early-warning signals; gene expression; potential disease modules; single-cell clustering.
© The Author(s) 2022. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.