Prediction of functional microexons by transfer learning

BMC Genomics. 2021 Nov 26;22(1):855. doi: 10.1186/s12864-021-08187-9.

Abstract

Background: Microexons are exons shorter than 30 nucleotides. More than 60% of annotated human microexons show high levels of sequence conservation, which suggests that they may be functional. A method for predicting functional microexons is therefore needed.

Results: Because no publicly available functional labels exist for microexons, we used a transfer learning technique, Transfer Component Analysis (TCA), to map features into a shared space and transfer knowledge from a related, labeled domain. Microindels were chosen as that reference domain because of their similarities to microexons. A Support Vector Machine (SVM) classifier was then trained on functional microindels in the newly built feature space, and the trained model was used to predict functional microexons. We also built a tool based on this model and applied it to 19 functional microexons reported in the literature. It correctly predicted 16 of the 19, an accuracy of about 84%.
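
The workflow described in the Results (map both domains into a shared space with TCA, train an SVM on the labeled domain, predict on the unlabeled one) can be illustrated with a minimal sketch. It is not the authors' implementation. The linear kernel, the trade-off parameter `mu`, the number of components, and the randomly generated feature matrices are all assumptions made for illustration, since the abstract does not specify the features or parameters used.

```python
import numpy as np
from sklearn.svm import SVC


def tca_embed(Xs, Xt, n_components=2, mu=1.0):
    """Minimal Transfer Component Analysis with a linear kernel (assumed).

    Xs: source-domain features (labeled), Xt: target-domain features (unlabeled).
    Returns the source and target samples in the shared low-dimensional space.
    """
    ns, nt = len(Xs), len(Xt)
    n = ns + nt
    X = np.vstack([Xs, Xt])
    K = X @ X.T  # linear kernel matrix over both domains

    # MMD coefficient matrix L: 1/ns^2 (source pairs), 1/nt^2 (target pairs),
    # -1/(ns*nt) (cross-domain pairs).
    e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])
    L = np.outer(e, e)

    # Centering matrix H.
    H = np.eye(n) - np.full((n, n), 1.0 / n)

    # Leading eigenvectors of (K L K + mu I)^{-1} K H K give the projection W.
    A = K @ L @ K + mu * np.eye(n)
    B = K @ H @ K
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(A, B))
    order = np.argsort(-eigvals.real)[:n_components]
    W = eigvecs[:, order].real

    Z = K @ W
    return Z[:ns], Z[ns:]


# Synthetic placeholder data only: these are NOT real microindel or microexon features.
rng = np.random.default_rng(0)
ys = np.array([0, 1] * 20)                         # hypothetical source labels
Xs = rng.normal(size=(40, 5)) + 2.0 * ys[:, None]  # stand-in for labeled microindels
Xt = rng.normal(size=(19, 5)) + 0.5                # stand-in for unlabeled microexons

Zs, Zt = tca_embed(Xs, Xt, n_components=2, mu=1.0)

clf = SVC(kernel="rbf")   # SVM trained in the shared feature space
clf.fit(Zs, ys)
pred = clf.predict(Zt)    # predicted labels for the target-domain samples
```

Because the target domain has no labels, only the source labels enter the SVM fit; the target samples influence the model solely through the shared TCA projection.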

Conclusions: We proposed a method for predicting functional microexons and applied it to microexons reported in the literature; its predictions were largely consistent with the published records.

Keywords: Functional prediction; Microexon; Microindel; Transfer learning.

MeSH terms

  • Exons
  • Humans
  • Support Vector Machine*