A Transferable Machine Learning Framework for Predicting Transcriptional Responses of Genes Across Species

Methods Mol Biol. 2023:2698:361-379. doi: 10.1007/978-1-0716-3354-0_21.

Abstract

Leveraging existing resources in well-studied species to predict gene functions has the potential to rapidly expand understanding of annotated genes in other, less well-studied species with assembled genomes. However, orthology is not a reliable predictor of the transcriptional responses of genes to stress. Machine learning methods can quantitatively estimate expression patterns and gene functions using known annotations and collections of features describing each gene. In this chapter, we describe a supervised machine learning framework that predicts stress-responsive genes across species using only features derived from nucleotide sequences, illustrated with cold stress-responsive genes in different Panicoid grass species.
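As an illustration of what "features derived from nucleotide sequences" can look like, the sketch below computes dinucleotide frequencies for one sequence, a feature type named in the keywords. It is a minimal example under stated assumptions, not the chapter's implementation: the function name `dinucleotide_frequencies`, the choice to skip pairs containing ambiguous bases such as N, and the normalization by the number of valid pairs are all hypothetical. The abstract does not say which region of each gene the features are computed from.

```python
from itertools import product

# The 16 possible dinucleotides in a fixed order (AA, AC, AG, AT, CA, ...).
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def dinucleotide_frequencies(sequence):
    """Return a 16-element feature vector of overlapping dinucleotide frequencies.

    Assumption: pairs containing any character other than A, C, G, or T
    (for example N) are skipped and do not count toward the total.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    # Slide a window of width 2 across the sequence, one base at a time.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # No valid dinucleotide found (empty, too short, or fully ambiguous).
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


# Hypothetical usage: one feature vector per gene sequence.
features = dinucleotide_frequencies("ACGTACGT")
```

Vectors like this, one per gene, could serve as the input matrix for a supervised classifier such as a random forest (also named in the keywords), trained on labels from one species and applied to sequences from another.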

Keywords: Dinucleotide frequency; Gene annotation; Grasses; Machine learning; Random forest; Transfer learning.

MeSH terms

  • Cold-Shock Response
  • Machine Learning*
  • Poaceae / genetics
  • Supervised Machine Learning*