ASPTF: A computational tool to predict abiotic stress-responsive transcription factors in plants by employing machine learning algorithms

Biochim Biophys Acta Gen Subj. 2024 Jun;1868(6):130597. doi: 10.1016/j.bbagen.2024.130597. Epub 2024 Mar 14.

Abstract

Background: Abiotic stresses pose a serious threat to the growth and yield of crop plants. Several studies suggest that transcription factors (TFs) are important regulators of gene expression in plants, especially in coping with abiotic stresses. Identifying TFs associated with abiotic stress response is therefore crucial for breeding abiotic stress-tolerant crop cultivars.

Methods: A computational model based on a machine learning framework was developed to predict TFs associated with abiotic stress response in plants. To numerically encode TF sequences, four distinct sequence-derived features were generated. Prediction was performed using ten shallow learning and four deep learning algorithms. Feature selection techniques were also employed so that prediction relied on the most pertinent and informative features.
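As an illustration of what numerically encoding a sequence and selecting features by importance can look like, the sketch below uses amino acid composition (AAC) and a simple top-k ranking. The abstract does not name the four feature types or the selection procedure in detail, so AAC, the helper names `amino_acid_composition` and `select_top_k`, and the importance scores are assumptions made for illustration only, not the authors' implementation.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every sequence
# maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid in a protein sequence.

    Hypothetical example feature: the abstract does not state that AAC is
    one of the four sequence-derived features used by ASPTF.
    """
    seq = sequence.upper()
    # Non-standard symbols (e.g. X, B, Z) are ignored.
    counts = Counter(residue for residue in seq if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def select_top_k(importances: list[float], k: int) -> list[int]:
    """Return the indices (ascending) of the k features with the largest scores.

    The scores stand in for any variable importance measure, such as one
    reported by a trained gradient boosting model.
    """
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    vector = amino_acid_composition("AACD")
    print(len(vector), vector[:3])
    # Made-up importance scores for four features.
    print(select_top_k([0.1, 0.5, 0.3, 0.0], k=2))
```

In a full pipeline, each TF sequence would be encoded into such a vector, a model would be trained once to obtain importance scores, and the model would then be retrained on the selected columns only.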

Results: Using the features selected by the light gradient boosting machine variable importance measure (LGBM-VIM), the LGBM classifier achieved the highest cross-validation performance (accuracy: 86.81%, auROC: 92.98%, and auPRC: 94.03%). The proposed model (LGBM prediction method + LGBM-VIM selected features) was further evaluated on an independent test dataset, where the accuracy, auROC and auPRC were 81.98%, 90.65% and 91.30%, respectively.
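For readers unfamiliar with the three reported metrics, the following minimal sketch computes accuracy, auROC and auPRC from true labels and predicted scores. It assumes binary labels (1 = stress-responsive TF, 0 = other), a 0.5 decision threshold, and the average-precision estimate of auPRC; the abstract does not specify how the authors computed these values, and the data below are invented.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def au_roc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive scores higher
    than a randomly chosen negative; ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def au_prc(labels, scores):
    """Average precision, a common estimate of the area under the PR curve.

    Samples are ranked by descending score and precision is averaged over
    the ranks at which a positive occurs. Tied scores are not treated
    specially in this sketch.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, precision_sum = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits


if __name__ == "__main__":
    # Invented toy data, unrelated to the ASPTF datasets.
    y_true = [1, 0, 1, 1, 0]
    y_score = [0.9, 0.8, 0.7, 0.4, 0.2]
    print(accuracy(y_true, y_score))
    print(au_roc(y_true, y_score))
    print(au_prc(y_true, y_score))
```

Unlike accuracy, auROC and auPRC do not depend on a single decision threshold, which is one reason all three are often reported together.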

Conclusions: To facilitate adoption of the proposed strategy, the approach was implemented as a prediction server called ASPTF, accessible at https://iasri-sg.icar.gov.in/asptf/. The developed approach and the corresponding web application are anticipated to supplement experimental methods for identifying TFs responsive to abiotic stress in plants.

Keywords: Abiotic stress; Bioinformatics; Computational model; Machine learning; Transcription factor.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Gene Expression Regulation, Plant
  • Machine Learning*
  • Plant Proteins / genetics
  • Plant Proteins / metabolism
  • Plants / genetics
  • Plants / metabolism
  • Stress, Physiological*
  • Transcription Factors* / genetics
  • Transcription Factors* / metabolism

Substances

  • Transcription Factors
  • Plant Proteins