Transfer Learning in Multiple Hypothesis Testing

Stefano Cabras; María Eugenia Castellanos Nueda

doi:10.3390/e26010049

Entropy (Basel). 2024 Jan 4;26(1):49. doi: 10.3390/e26010049.

Authors

Stefano Cabras¹, María Eugenia Castellanos Nueda²

Affiliations

¹ Department of Statistics, University Carlos III of Madrid, 28903 Madrid, Spain.
² Department of Informatics and Statistics, Rey Juan Carlos University, 28933 Mostoles, Spain.

PMID: 38248175

Abstract

This study combines Convolutional Neural Networks (CNNs) with Bayesian inference to give a novel approach to Multiple Hypothesis Testing (MHT). In contrast to traditional approaches, it introduces a sequence-based uncalibrated Bayes factor approach for testing many hypotheses under the same family of parametric sampling models. The methodology has two steps: a learning phase, in which a model is fitted to simulated datasets covering a wide spectrum of null and alternative hypotheses, and a transfer phase, in which the fitted model is applied to real-world experimental sequences. The result is a CNN model that handles MHT with improved precision over traditional methods and that is robust to varying conditions, including the number of true nulls and dependence between tests. The empirical evaluations presented indicate that the methodology is useful, but a full evaluation from a theoretical perspective requires more work. The potential of the approach is further illustrated in genomics. Although a formal proof of the model's consistency remains elusive because of the inherent complexity of the algorithms, the paper also provides some theoretical insights and advocates continued exploration of the methodology.
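
The two-step structure described in the abstract (learn on simulated sequences, then transfer the fitted model to a new sequence) can be illustrated with a deliberately small sketch. It is not the authors' model. Everything here is an assumption made for illustration: z-statistics stand in for Bayes factors, a single three-tap 1D convolution filter with a sigmoid output stands in for the CNN, the effect size and the range of null proportions are arbitrary, and a held-out simulated sequence stands in for real experimental data.

```python
import math
import random


def simulate(rng, m=50):
    """Simulate one experiment: m sorted |z| statistics and their true labels."""
    pi0 = rng.uniform(0.5, 0.95)  # assumed range for the share of true nulls
    labels = [0 if rng.random() < pi0 else 1 for _ in range(m)]
    # Assumed sampling model: N(0, 1) under the null, N(3, 1) under the alternative.
    stats = [abs(rng.gauss(3.0 * y, 1.0)) for y in labels]
    order = sorted(range(m), key=lambda i: stats[i])
    return [stats[i] for i in order], [labels[i] for i in order]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def windows(x):
    """Three-tap windows over the sequence, with edge padding."""
    pad = [x[0]] + x + [x[-1]]
    return [pad[i:i + 3] for i in range(len(x))]


def predict(w, b, x):
    """Per-test probability of the alternative from a one-filter 1D convolution."""
    return [sigmoid(b + sum(wk * vk for wk, vk in zip(w, win)))
            for win in windows(x)]


def learn(rng, n_datasets=400, lr=0.02):
    """Learning phase: fit the filter on simulated sequences by SGD."""
    w, b = [0.0, 0.0, 0.0], 0.0
    for _ in range(n_datasets):
        x, y = simulate(rng)
        pairs = list(zip(windows(x), y))
        rng.shuffle(pairs)  # avoid always visiting the sorted sequence in order
        for win, label in pairs:
            p = sigmoid(b + sum(wk * vk for wk, vk in zip(w, win)))
            g = p - label  # gradient of the cross-entropy loss w.r.t. the logit
            w = [wk - lr * g * vk for wk, vk in zip(w, win)]
            b -= lr * g
    return w, b


def transfer(w, b, sequence):
    """Transfer phase: apply the fitted filter, unchanged, to a new sequence."""
    return [p > 0.5 for p in predict(w, b, sorted(sequence))]


rng = random.Random(0)
w, b = learn(rng)
new_x, new_y = simulate(rng)  # stand-in for an experimental sequence
calls = transfer(w, b, new_x)
accuracy = sum(c == bool(t) for c, t in zip(calls, new_y)) / len(new_y)
```

The point of the sketch is the separation of concerns: `learn` only ever sees simulated data, and `transfer` reuses the fitted parameters without refitting. Sorting the statistics before convolving is one way to let neighbouring positions carry information about the whole sequence. Whether the paper orders its sequences this way is not stated in the abstract.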

Keywords: RNA-seq experiments; Bayes factors; deep learning; improper priors; objective Bayesian inference; random sequences.

Grants and funding

PID2022-138201NB-I00/Ministerio de Ciencia e Innovación