An analytical study on the identification of N-linked glycosylation sites using machine learning model

PeerJ Comput Sci. 2022 Sep 21:8:e1069. doi: 10.7717/peerj-cs.1069. eCollection 2022.

Abstract

N-linked glycosylation is the most common type of glycosylation. It plays a significant role in identifying various diseases, such as type I diabetes and cancer, and aids drug development. Most proteins cannot perform their biological and psychological functions without undergoing this modification. Because experimental identification of such sites has limitations, it is essential to identify them with computational techniques. This study aims to analyze and synthesize the progress made in discovering N-linked sites using machine learning methods. It also explores the performance of currently available tools for predicting such sites. Almost seventy research articles published in recognized journals of the N-linked glycosylation field were shortlisted after a rigorous filtering process. The findings of these studies are reported along multiple aspects: publication channel, feature set construction method, training algorithm, and performance evaluation. Moreover, a taxonomy of N-linked sequence identification has been developed from the literature survey. Our study focuses on performance evaluation criteria; the importance of N-linked glycosylation, together with the limitations of experimental methods, motivates us to survey resources that use computational methods instead.

Keywords: Artificial intelligence; Deep learning; Glycosylation; Machine learning; N-linked; Performance evaluation criteria.

Grants and funding

The authors received no funding for this work.