Linking Phenotypes and Genotypes with Matrix Factorizations

Curr Pharm Biotechnol. 2023;24(12):1576-1588. doi: 10.2174/1389201024666230207153738.

Abstract

Aims: We linked phenotypes and genotypes using PheGe-Net, a unified framework.

Background: Genotype is the collective term for all gene combinations of an individual; it reflects the genetic composition of an organism. Phenotype refers to the observable macroscopic characteristics of an organism.

Objective: Identifying phenotype-genotype associations helps to explain pathogenesis and to advance genomic medicine.

Methods: PheGe-Net exploited the similarity networks of phenotypes and genotypes, together with acknowledged phenotype-genotype relationships, to discover hidden interactions between them.
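As an illustration of the general idea named in the keywords (joint matrix factorization solved block by block under constraints), the following is a minimal sketch, not the PheGe-Net objective itself, which the abstract does not state. It assumes a phenotype similarity matrix `S_p`, a genotype similarity matrix `S_g`, and a known association matrix `R`, and fits non-negative factors `U` and `V` by alternating projected gradient steps on the hypothetical objective ||S_p - UU^T||^2 + ||S_g - VV^T||^2 + ||R - UV^T||^2. All names, the toy data, the step size, and the objective are assumptions chosen for illustration.

```python
import numpy as np

def objective(S_p, S_g, R, U, V):
    """Sum of squared Frobenius reconstruction errors (illustrative objective)."""
    return (np.linalg.norm(S_p - U @ U.T) ** 2
            + np.linalg.norm(S_g - V @ V.T) ** 2
            + np.linalg.norm(R - U @ V.T) ** 2)

def joint_factorize(S_p, S_g, R, k=2, steps=2000, lr=0.005, seed=0):
    """Alternate projected gradient steps on U and V (a simple block scheme).

    Hypothetical sketch: the objective and update rule are assumptions,
    not the method described in the paper.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((S_p.shape[0], k))   # phenotype factors
    V = rng.random((S_g.shape[0], k))   # genotype factors
    history = [objective(S_p, S_g, R, U, V)]
    for _ in range(steps):
        # Block 1: update U with V fixed, then project onto U >= 0.
        grad_U = 4 * (U @ U.T - S_p) @ U + 2 * (U @ V.T - R) @ V
        U = np.maximum(U - lr * grad_U, 0.0)
        # Block 2: update V with U fixed, then project onto V >= 0.
        grad_V = 4 * (V @ V.T - S_g) @ V + 2 * (U @ V.T - R).T @ U
        V = np.maximum(V - lr * grad_V, 0.0)
        history.append(objective(S_p, S_g, R, U, V))
    return U, V, history

# Toy data (invented for this sketch): 6 phenotypes and 4 genotypes,
# each forming two groups, with associations between matching groups.
S_p = np.kron(np.eye(2), np.ones((3, 3)))
S_g = np.kron(np.eye(2), np.ones((2, 2)))
R = np.kron(np.eye(2), np.ones((3, 2)))

U, V, history = joint_factorize(S_p, S_g, R, k=2)
phenotype_clusters = U.argmax(axis=1)   # cluster = largest factor per row
association_scores = U @ V.T            # predicted phenotype-genotype scores
```

In this sketch, the rows of `U` and `V` serve as soft cluster memberships, and `U @ V.T` gives a score for every phenotype-genotype pair, including pairs absent from `R`.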

Results: Experiments on a real-world dataset verified the validity of PheGe-Net. When clustering phenotypes and genotypes, our method outperformed the second-best method by around 3% in accuracy and normalized mutual information (NMI). It also successfully detected phenotype-genotype associations. For example, in the analysis of obesity (OMIM ID: 601665), two known genes among the top ten scored genes received scores above 0.75, and the other eight predicted genes are also explainable.
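For readers unfamiliar with NMI as a clustering metric, the following is a minimal sketch of one common variant, which divides the mutual information between two labelings by the geometric mean of their entropies. The abstract does not say which normalization was used, so that choice, like the function name `nmi`, is an assumption.

```python
import numpy as np

def nmi(labels_true, labels_pred):
    """Normalized mutual information with geometric-mean normalization."""
    a = np.asarray(labels_true).ravel()
    b = np.asarray(labels_pred).ravel()
    n = a.size
    # Map arbitrary labels to 0..k-1 indices.
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    # Contingency table of co-occurring labels.
    cont = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(cont, (ai, bi), 1)
    pxy = cont / n
    px = pxy.sum(axis=1)
    py = pxy.sum(axis=0)
    nz = pxy > 0                      # skip empty cells to avoid log(0)
    mi = (pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])).sum()
    hx = -(px * np.log(px)).sum()
    hy = -(py * np.log(py)).sum()
    denom = np.sqrt(hx * hy)
    if denom == 0:
        # Degenerate case: at least one labeling has a single cluster.
        return 1.0 if (hx == 0 and hy == 0) else 0.0
    return mi / denom

same_partition = nmi([0, 0, 1, 1], [1, 1, 0, 0])   # identical up to renaming
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])      # no shared information
```

NMI is invariant to renaming of cluster labels, which is why it suits comparisons of predicted clusters against reference groupings; clustering accuracy, by contrast, requires first matching predicted clusters to reference classes.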

Conclusion: PheGe-Net can not only discover latent phenotype or genotype clusters but also uncover the hidden relationships among them, provided that similarity networks of phenotypes and genotypes and acknowledged phenotype-genotype relationships are available.

Keywords: Phenotype; block coordinate descent; constrained nonlinear optimization; genotype; joint matrix factorization; phenotype-genotype association.

MeSH terms

  • Genotype*
  • Phenotype