Classification of MLH1 Missense VUS Using Protein Structure-Based Deep Learning-Ramachandran Plot-Molecular Dynamics Simulations Method

Int J Mol Sci. 2024 Jan 10;25(2):850. doi: 10.3390/ijms25020850.

Abstract

Pathogenic variation in the DNA mismatch repair (MMR) gene MLH1 is associated with Lynch syndrome (LS), an autosomal dominant hereditary cancer syndrome. Of the 3798 MLH1 germline variants collected in the ClinVar database, 38.7% (1469) were missense variants, of which 81.6% (1199) were classified as Variants of Uncertain Significance (VUS) because functional evidence is lacking. Determining the impact of these VUS on MLH1 function is important so that carriers can take preventive action. We recently developed a protein structure-based method named "Deep Learning-Ramachandran Plot-Molecular Dynamics Simulation (DL-RP-MDS)" to evaluate the deleteriousness of MLH1 missense VUS. The method extracts protein structural information using the Ramachandran plot-molecular dynamics simulation (RP-MDS) method, then combines the variation data with an unsupervised learning model composed of an autoencoder and a neural network classifier to identify variants that cause significant changes in protein structure. In this report, we applied the method to classify 447 MLH1 missense VUS and predicted that 126 of them (28.2%) were deleterious. Our study demonstrates that DL-RP-MDS can classify missense VUS based solely on their impact on protein structure.
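
To illustrate the kind of structural signal the abstract describes, the following is a minimal sketch, not the authors' pipeline. It assumes that each simulation frame yields one backbone (phi, psi) pair for a residue of interest, bins those pairs into a Ramachandran density, and compares wild-type and variant densities. The function names (`dihedral`, `ramachandran_density`, `density_shift`), the 10-degree bin width, and the toy angle values are all hypothetical. The total variation distance used here is a simple stand-in for the autoencoder and neural network classifier named in the abstract, whose architecture is not given in this record.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)  # normals of the two planes
    x = dot(n1, n2)
    y = math.sqrt(dot(b1, b1)) * dot(b0, n2)
    return math.degrees(math.atan2(y, x))


def ramachandran_density(angles, bins=36):
    """Normalised 2D histogram of (phi, psi) pairs given in degrees."""
    if not angles:
        raise ValueError("at least one (phi, psi) pair is required")
    width = 360.0 / bins
    grid = [[0.0] * bins for _ in range(bins)]
    for phi, psi in angles:
        # Shift [-180, 180] to [0, 360] and wrap +180 onto the first bin.
        i = int((phi + 180.0) // width) % bins
        j = int((psi + 180.0) // width) % bins
        grid[i][j] += 1.0
    total = float(len(angles))
    return [[count / total for count in row] for row in grid]


def density_shift(wild_type, variant):
    """Total variation distance between two densities (0 means identical)."""
    return 0.5 * sum(abs(a - b)
                     for row_wt, row_var in zip(wild_type, variant)
                     for a, b in zip(row_wt, row_var))


# Toy (phi, psi) samples standing in for per-frame values from a trajectory.
# These numbers are invented for illustration only.
wt_frames = [(-60.0, -45.0), (-62.0, -41.0), (-58.0, -47.0), (-65.0, -40.0)]
var_frames = [(-60.0, -45.0), (-120.0, 130.0), (-118.0, 135.0), (-125.0, 128.0)]

shift = density_shift(ramachandran_density(wt_frames),
                      ramachandran_density(var_frames))
print(f"Ramachandran density shift: {shift:.2f}")
```

In practice the (phi, psi) pairs would come from the C-N-CA-C and N-CA-C-N backbone atoms of each frame via `dihedral`, and a larger shift would only suggest, not establish, that a variant perturbs local backbone conformation.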

Keywords: MLH1; Ramachandran plot; VUS; autoencoder; deep learning; molecular dynamics simulation; neural network.

MeSH terms

  • Colorectal Neoplasms, Hereditary Nonpolyposis* / genetics
  • DNA Mismatch Repair
  • Databases, Factual
  • Deep Learning*
  • Humans
  • Molecular Dynamics Simulation
  • MutL Protein Homolog 1* / genetics

Substances

  • MLH1 protein, human
  • MutL Protein Homolog 1