LTPConstraint: a transfer learning based end-to-end method for RNA secondary structure prediction

BMC Bioinformatics. 2022 Aug 23;23(1):354. doi: 10.1186/s12859-022-04847-z.

Abstract

Background: RNA secondary structure is important for understanding cellular activity and disease occurrence. The first approach used to determine this structure was biological experiment, but it is too expensive to be widely adopted. Computational methods then emerged, offering good efficiency and low cost. However, the accuracy of computational methods is not satisfactory. Many machine learning methods have also been applied to this area, but accuracy has not improved significantly. Deep learning has matured and achieved great success in many areas such as computer vision and natural language processing. It uses neural networks, a kind of structure with good functionality and versatility, but their effectiveness is highly correlated with the quantity and quality of the data. At present, there is no model that combines high accuracy, low data dependence and high convenience in predicting RNA secondary structure.

Results: This paper designs a neural network called LTPConstraint to predict RNA secondary structure. The network is based on several network structures, including Bidirectional LSTM, Transformer and generator. It also uses transfer learning to train the model so that data dependence can be reduced.

Conclusions: LTPConstraint achieves high accuracy in RNA secondary structure prediction. Compared with previous methods, accuracy improves clearly both for structures with pseudoknots and for structures without pseudoknots. At the same time, LTPConstraint is easy to operate and produces results very quickly.
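
Illustrative note: the distinction between structures with and without pseudoknots can be made concrete with a short check. The sketch below is not part of LTPConstraint; it assumes a structure is given as a list of 0-based base-pair index tuples, and the function name has_pseudoknot is hypothetical. It uses the standard definition that two pairs (i, j) and (k, l) form a pseudoknot when they cross, that is, i < k < j < l.

```python
def has_pseudoknot(pairs):
    """Return True if any two base pairs cross (i < k < j < l).

    `pairs` is an assumed input format: a list of (i, j) tuples of
    0-based nucleotide positions that are paired with each other.
    """
    # Normalise each pair so the smaller index comes first, then sort
    # by opening position so that i <= k for every later pair (k, l).
    ordered = sorted((min(a, b), max(a, b)) for a, b in pairs)
    for idx, (i, j) in enumerate(ordered):
        for k, l in ordered[idx + 1:]:
            # Crossing: the second pair opens inside the first pair
            # and closes outside it.
            if i < k < j < l:
                return True
    return False


# A fully nested stem (no pseudoknot) and two crossing pairs (pseudoknot).
nested = [(0, 9), (1, 8), (2, 7)]
crossing = [(0, 5), (3, 8)]
```

Nested structures can be written with a single bracket type in dot-bracket notation, whereas crossing pairs need an additional bracket type, which is one reason pseudoknotted structures are often evaluated separately.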

Keywords: Bi-LSTM; Generator; RNA secondary structure; Transfer learning; Transformer.

MeSH terms

  • Machine Learning
  • Neural Networks, Computer*
  • Protein Structure, Secondary
  • RNA* / chemistry

Substances

  • RNA