Interpretable modeling of genotype-phenotype landscapes with state-of-the-art predictive power

Peter D Tonner; Abe Pressman; David Ross

doi:10.1073/pnas.2114021119

Proc Natl Acad Sci U S A. 2022 Jun 28;119(26):e2114021119. doi: 10.1073/pnas.2114021119. Epub 2022 Jun 21.

Authors

Peter D Tonner¹, Abe Pressman², David Ross²

Affiliations

¹ Statistical Engineering Division, National Institute of Standards and Technology, Gaithersburg, MD 20899.
² Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD 20899.

Abstract

Large-scale measurements linking genetic background to biological function have driven a need for models that can incorporate these data for reliable predictions and insight into the underlying biophysical system. Recent modeling efforts, however, prioritize predictive accuracy at the expense of model interpretability. Here, we present LANTERN (landscape interpretable nonparametric model, https://github.com/usnistgov/lantern), a hierarchical Bayesian model that distills genotype-phenotype landscape (GPL) measurements into a low-dimensional feature space that represents the fundamental biological mechanisms of the system while also enabling straightforward, explainable predictions. Across a benchmark of large-scale datasets, LANTERN equals or outperforms all alternative approaches, including deep neural networks. LANTERN furthermore extracts useful insights into the landscape, including its inherent dimensionality, a latent space of additive mutational effects, and metrics of landscape structure. LANTERN facilitates straightforward discovery of fundamental mechanisms in GPLs, while also reliably extrapolating to unexplored regions of genotypic space.
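
The idea of a latent space of additive mutational effects can be illustrated with a minimal sketch. This is not the LANTERN implementation or its API: the effect vectors, the genotype encoding, and the names `latent_coordinates` and `toy_surface` are hypothetical, and a fixed logistic function stands in for the nonparametric surface that the model would learn from data. The sketch assumes only that each mutation contributes a vector in a low-dimensional space, that a variant's position is the sum of the vectors of its mutations, and that the phenotype is a nonlinear function of that position.

```python
import numpy as np

# Hypothetical per-mutation effect vectors in a 2-D latent space
# (rows: mutations A, B, C). The values are illustrative, not fitted.
effects = np.array([
    [1.0, 0.0],
    [-0.5, 0.2],
    [0.0, 1.0],
])

# Binary genotype encoding: 1 if the variant carries the mutation.
genotypes = np.array([
    [0, 0, 0],  # wild type
    [1, 0, 0],  # A
    [0, 1, 0],  # B
    [1, 1, 0],  # A + B
])


def latent_coordinates(genotypes, effects):
    """Sum the effect vectors of the mutations present in each variant."""
    return genotypes @ effects


def toy_surface(z):
    """Assumed stand-in surface: a logistic curve along the first latent axis."""
    return 1.0 / (1.0 + np.exp(-z[:, 0]))


z = latent_coordinates(genotypes, effects)
phenotype = toy_surface(z)

# Mutational effects add in the latent space ...
print(np.allclose(z[3], z[1] + z[2]))

# ... but not in the phenotype, because the surface is nonlinear.
double_effect = phenotype[3] - phenotype[0]
sum_of_singles = (phenotype[1] - phenotype[0]) + (phenotype[2] - phenotype[0])
print(double_effect, sum_of_singles)
```

In this toy setting the double mutant sits exactly at the sum of the two single-mutant positions, while its phenotype change differs from the sum of the single-mutant changes. That gap is the kind of epistasis a nonlinear surface over an additive latent space can express.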

Keywords: epistasis; genotype–phenotype landscape; interpretability; machine learning.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Bayes Theorem
Gene-Environment Interaction*
Genotype*
Mutation
Neural Networks, Computer*
Phenotype*

Associated data

figshare/10.6084/m9.figshare.3102154