Computational Methods for Prediction of Human Protein-Phenotype Associations: A Review

Phenomics. 2021 Aug 6;1(4):171-185. doi: 10.1007/s43657-021-00019-w. eCollection 2021 Aug.

Abstract

Deciphering the relationship between human proteins (genes) and phenotypes is one of the fundamental tasks in phenomics research. The Human Phenotype Ontology (HPO) builds upon a standardized logical vocabulary to describe the abnormal phenotypes encountered in human diseases and paves the way towards the computational analysis of their genetic causes. To date, many computational methods have been proposed to predict the HPO annotations of proteins. In this paper, we conduct a comprehensive review of the existing approaches to predicting HPO annotations of novel proteins, identifying missing HPO annotations, and prioritizing candidate proteins with respect to a certain HPO term. For each topic, we first give the formalized description of the problem, and then systematically revisit the published literatures highlighting their advantages and disadvantages, followed by the discussion on the challenges and promising future directions. In addition, we point out several potential topics to be worthy of exploration including the selection of negative HPO annotations and detecting HPO misannotations. We believe that this review will provide insight to the researchers in the field of computational phenotype analyses in terms of comprehending and developing novel prediction algorithms.

Keywords: Deep learning; HPO annotation; Human Phenotype Ontology (HPO); Human protein-phenotype association; Machine learning.

Publication types

  • Review