Inferring the Effects of Protein Variants on Protein-Protein Interactions with Interpretable Transformer Representations

Zhe Liu; Wei Qian; Wenxiang Cai; Weichen Song; Weidi Wang; Dhruba Tara Maharjan; Wenhong Cheng; Jue Chen; Han Wang; Dong Xu; Guan Ning Lin

doi:10.34133/research.0219

Research (Wash D C). 2023 Sep 11:6:0219. doi: 10.34133/research.0219. eCollection 2023.

Authors

Zhe Liu¹, Wei Qian¹, Wenxiang Cai¹, Weichen Song¹, Weidi Wang¹,², Dhruba Tara Maharjan¹, Wenhong Cheng¹, Jue Chen¹, Han Wang³, Dong Xu⁴,⁵, Guan Ning Lin¹,²

Affiliations

¹ Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China.
² Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China.
³ School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China.
⁴ Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA.
⁵ Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.

Abstract

Identifying pathogenic variants and inferring their impact on protein-protein interactions sheds light on their functional consequences in disease. Because experimental data on how variants alter protein interactions are limited, most existing methods focus on predicting changes in protein binding affinity. Here, we introduce MIPPI, an end-to-end, interpretable transformer-based deep learning model that learns features directly from sequences by leveraging interaction data from IMEx. MIPPI was specifically trained to determine the type of variant impact (increasing, decreasing, disrupting, or no effect) on protein-protein interactions. We demonstrate the accuracy of MIPPI and interpret its predictions through analysis of the learned attention weights, which correlate with the amino acids interacting with the variant. Moreover, we show the practicality of MIPPI in prioritizing de novo mutations associated with complex neurodevelopmental disorders and its potential to identify pathogenic and driver mutations. Finally, we experimentally validated the functional impact of several variants identified in patients with such disorders. Overall, MIPPI emerges as a versatile, robust, and interpretable model, capable of effectively predicting mutation impacts on protein-protein interactions and facilitating the discovery of clinically actionable variants.
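
As a rough illustration of two ideas mentioned in the abstract, the sketch below shows how a four-way impact label could be read off a vector of class scores, and how scaled dot-product attention weights over residues could be computed for a single query position. This is not the MIPPI implementation: the function names (`softmax`, `classify_impact`, `attention_weights`), the class ordering, and all numeric inputs are assumptions made for illustration only.

```python
import math

# The four impact types named in the abstract; the ordering is an assumption.
IMPACT_CLASSES = ("increasing", "decreasing", "disrupting", "no effect")


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_impact(logits):
    """Return (label, probability) for the highest-scoring impact class."""
    if len(logits) != len(IMPACT_CLASSES):
        raise ValueError("expected one logit per impact class")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return IMPACT_CLASSES[best], probs[best]


def attention_weights(query, keys):
    """Scaled dot-product attention of one query vector over key vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


if __name__ == "__main__":
    # Hypothetical logits, not output from any trained model.
    label, prob = classify_impact([0.2, 1.5, 0.9, -0.3])
    print(label, round(prob, 3))

    # Hypothetical 2-d embeddings: one variant position attending to three residues.
    weights = attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
    print([round(w, 3) for w in weights])
```

In a real model the logits and embeddings would come from trained layers; here they are hand-picked so that the residue whose key vector aligns with the query receives the largest weight.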