Identifying stress responsive genes using overlapping communities in co-expression networks

BMC Bioinformatics. 2021 Nov 7;22(1):541. doi: 10.1186/s12859-021-04462-4.

Abstract

Background: This paper proposes a workflow to identify genes that respond to specific treatments in plants. The workflow takes as input RNA sequencing read counts and phenotypic data of different genotypes, measured under control and treatment conditions. It outputs a reduced group of genes marked as relevant for treatment response. Technically, the proposed approach is both a generalization and an extension of WGCNA (Weighted Gene Co-expression Network Analysis). It aims to identify specific modules of overlapping communities underlying the co-expression network of genes. Module detection is achieved with Hierarchical Link Clustering, which groups the links of the network rather than its nodes, so a gene can belong to more than one module. Such modules capture the overlapping nature of the regulatory domains that generate co-expression. LASSO regression is then employed to analyze the phenotypic responses of modules to treatment.
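The idea behind link-based module detection can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `link_communities`, the toy gene labels, and the fixed similarity threshold are assumptions made for illustration (Hierarchical Link Clustering normally builds a full dendrogram and chooses the cut by a quality criterion, and co-expression networks are typically weighted). The sketch scores pairs of edges that share a node by the Jaccard similarity of the inclusive neighbourhoods of their other two endpoints, merges similar edges, and reports the nodes covered by each group of edges.

```python
from itertools import combinations


def link_communities(edges, threshold):
    """Group edges by Jaccard similarity (single linkage) and return node sets.

    Simplification (assumption): a fixed threshold replaces the dendrogram cut
    used by full Hierarchical Link Clustering, and edges are unweighted.
    """
    # Inclusive neighbourhood of each node: the node itself plus its neighbours.
    neigh = {}
    for u, v in edges:
        neigh.setdefault(u, {u}).add(v)
        neigh.setdefault(v, {v}).add(u)

    # Union-find structure over edge indices.
    parent = list(range(len(edges)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(edges)), 2):
        shared = set(edges[i]) & set(edges[j])
        if len(shared) != 1:
            continue  # only edges sharing exactly one node are compared
        a = (set(edges[i]) - shared).pop()
        b = (set(edges[j]) - shared).pop()
        similarity = len(neigh[a] & neigh[b]) / len(neigh[a] | neigh[b])
        if similarity >= threshold:
            parent[find(i)] = find(j)

    # Each group of edges induces a (possibly overlapping) set of nodes.
    groups = {}
    for i, edge in enumerate(edges):
        groups.setdefault(find(i), set()).update(edge)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy network: two triangles of genes sharing gene "C".
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"),
             ("C", "D"), ("C", "E"), ("D", "E")]
modules = link_communities(toy_edges, threshold=0.5)
print(modules)
```

In this toy network, gene "C" ends up in both modules, which is the kind of overlap that node-based clustering cannot represent.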

Results: The workflow is applied to rice (Oryza sativa), a major food source known to be highly sensitive to salt stress. The workflow identifies 19 rice genes that seem relevant in the response to salt stress. They are distributed across 6 modules: 3 modules of 3 genes each are associated with shoot K content; 2 modules of 3 genes each are associated with shoot biomass; and 1 module of 4 genes is associated with root biomass. These genes represent target genes for the improvement of salinity tolerance in rice.
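Associations of this kind rest on the property that LASSO shrinks the coefficients of uninformative predictors to exactly zero. The following sketch shows that property with a plain coordinate-descent solver for the objective (1/(2n))·||y − Xβ||² + α·||β||₁. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the penalty value, and the toy data (two columns standing in for hypothetical module summaries, and a trait that depends only on the first) are invented for the example.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + alpha * ||b||_1 by coordinate descent.

    Assumes no intercept and that every column of X has a nonzero norm.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that excludes column j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, alpha) / z
    return beta


# Hypothetical toy data: the trait equals 2 * (first column); the second
# column is unrelated to the trait.
X_toy = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [4.0, -1.0]]
y_toy = [2.0, 4.0, 6.0, 8.0]
coefficients = lasso_coordinate_descent(X_toy, y_toy, alpha=0.1)
print(coefficients)
```

The coefficient of the unrelated column is set exactly to zero, while the informative column keeps a coefficient slightly below 2 because of the penalty.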

Conclusions: The workflow provides a more effective framework for reducing the search space of target genes that respond to a specific treatment. It facilitates experimental validation by restricting efforts to a smaller subset of genes of high potential relevance.

Keywords: Co-expression network; LASSO; Oryza sativa; Overlapping communities; Phenotypic traits; Rice; Salinity; Stress-responsive genes.

MeSH terms

  • Genotype
  • Oryza* / genetics
  • Salt Tolerance
  • Sequence Analysis, RNA
  • Stress, Physiological / genetics