A graph-based machine learning framework identifies critical properties of FVIII that lead to hemophilia A

Front Bioinform. 2023 May 10:3:1152039. doi: 10.3389/fbinf.2023.1152039. eCollection 2023.

Abstract

Introduction: Blood coagulation is an essential process that stops bleeding in humans and other species. It is characterized by a molecular cascade of more than a dozen components that is activated after injury to a blood vessel. In this process, coagulation factor VIII (FVIII) is a master regulator, enhancing the activity of other components by a factor of thousands. It is therefore unsurprising that even single amino acid substitutions can result in hemophilia A (HA), a disease marked by uncontrolled bleeding that leaves patients at permanent risk of hemorrhagic complications.

Methods: Despite recent advances in the diagnosis and treatment of HA, the precise role of each residue of the FVIII protein remains unclear. In this study, we developed a graph-based machine learning framework that explores in detail the network formed by the residues of the FVIII protein, where each residue is a node and two nodes are connected if they are in close proximity in the FVIII 3D structure.

Results: Using this system, we identified the properties that lead to severe and mild forms of the disease. Finally, in an effort to advance the development of novel recombinant therapeutic FVIII proteins, we adapted our framework to predict the activity and expression of more than 300 in vitro alanine mutations, again observing close agreement between the in silico and in vitro results.

Discussion: Together, the results of this study demonstrate how graph-based classifiers can advance the diagnosis and treatment of a rare disease.
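
The residue network described in the Methods (one node per residue, an edge between residues that are close in the 3D structure) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not state the distance threshold or which atom represents each residue, so the 8.0 angstrom cutoff, the single 3D point per residue, the function name `build_residue_graph`, and the toy coordinates are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def build_residue_graph(coords, cutoff=8.0):
    """Build an undirected residue contact graph.

    coords: dict mapping a residue id to one (x, y, z) point in angstroms
            (assumed here to be a single representative atom per residue).
    cutoff: assumed distance threshold; the abstract does not specify one.
    Returns a dict mapping each residue id to the set of its neighbours.
    """
    graph = {res_id: set() for res_id in coords}
    for (a, pos_a), (b, pos_b) in combinations(coords.items(), 2):
        # Two residues are connected if they are in close spatial proximity.
        if dist(pos_a, pos_b) <= cutoff:
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical toy coordinates, not taken from any FVIII structure.
toy_coords = {
    1: (0.0, 0.0, 0.0),
    2: (3.8, 0.0, 0.0),
    3: (7.6, 0.0, 0.0),
    4: (30.0, 0.0, 0.0),
}

graph = build_residue_graph(toy_coords)

# Node degree is one simple per-residue property of such a network.
degrees = {res_id: len(neighbours) for res_id, neighbours in graph.items()}
```

In this toy example, residues 1 to 3 are mutually connected and residue 4 is isolated. A real analysis would read coordinates from a structure file and would likely use richer node features than degree alone.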

Keywords: FVIII; FVIIIa; bioinformatics; graph neural network; machine learning; protein structure; residue network.

Grants and funding

TL was supported by the Council for Science, Technology and Innovation (CSTI), Cross-ministerial Strategic Innovation Promotion Program (SIP), “Innovative AI Hospital System,” the National Institute of Biomedical Innovation, Health and Nutrition (NIBIOHN) [grant number SIPAIH20D01], JSPS KAKENHI [JP22K06119] and the National Center for Child Health and Development internal grant [2022B-2]. RR was supported by Google Research Awards for Latin America 2021. RR and TN were supported by a grant from the Terumo Life Science Foundation, CAPES (Coordination for the Improvement of Higher Education Personnel—Brazilian federal government agency), CNPq (Brazilian National Council for Scientific and Technological Development) and FAPESP (Center of Mathematical Sciences Applied to Industry, CEPID-CeMEAI) [2013/07375-0].