Construction of a Searchable Database for Gene Expression Changes in Spinal Cord Injury Experiments

J Neurotrauma. 2023 Nov 23. doi: 10.1089/neu.2023.0035. Online ahead of print.

Abstract

Spinal cord injury (SCI) is a debilitating condition with an estimated 18,000 new cases annually in the United States. The field has accepted and adopted standardized databases such as the Open Data Commons for Spinal Cord Injury (ODC-SCI) to aid in broader analyses, but these currently lack high-throughput data, even though nearly 6000 samples from over 90 studies are available in the Sequence Read Archive. This limits the potential for large datasets to enhance our understanding of SCI-related mechanisms at the molecular and cellular levels. Therefore, we have developed a protocol for processing RNA-Seq samples from high-throughput sequencing experiments related to SCI, resulting in both raw and normalized data that can be efficiently mined for comparisons across studies, as well as homologous discovery across species. We have processed 1196 publicly available RNA-Seq samples from 50 bulk RNA-Seq studies across nine different species, resulting in an SQLite database that can be used by the SCI research community for further discovery. We provide both the database and a web-based front-end that can be used to query the database for genes of interest, differential gene expression, genes with high variance, and gene set enrichments.
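
As a rough illustration of the kinds of queries described above (genes of interest and genes with high variance), the following minimal sketch uses Python's standard `sqlite3` module. The abstract does not describe the database schema, so the table names (`samples`, `expression`), column names, gene labels (`GeneA`, `GeneB`), and expression values are all hypothetical placeholders, not the published layout or real data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a HYPOTHETICAL schema and toy values."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE samples (
            sample_id TEXT PRIMARY KEY,
            study_id  TEXT,
            species   TEXT,
            condition TEXT
        );
        CREATE TABLE expression (
            sample_id  TEXT REFERENCES samples(sample_id),
            gene       TEXT,
            normalized REAL
        );
        """
    )
    # Made-up samples: two sham and two injured animals from one study.
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?, ?, ?)",
        [
            ("s1", "study1", "mouse", "sham"),
            ("s2", "study1", "mouse", "sham"),
            ("s3", "study1", "mouse", "injured"),
            ("s4", "study1", "mouse", "injured"),
        ],
    )
    # Made-up normalized expression values for two placeholder genes.
    conn.executemany(
        "INSERT INTO expression VALUES (?, ?, ?)",
        [
            ("s1", "GeneA", 1.0), ("s2", "GeneA", 3.0),
            ("s3", "GeneA", 7.0), ("s4", "GeneA", 9.0),
            ("s1", "GeneB", 5.0), ("s2", "GeneB", 5.0),
            ("s3", "GeneB", 5.0), ("s4", "GeneB", 5.0),
        ],
    )
    return conn


def mean_expression_by_condition(conn, gene):
    """Gene-of-interest query: mean normalized expression per condition."""
    rows = conn.execute(
        """
        SELECT s.condition, AVG(e.normalized)
        FROM expression AS e
        JOIN samples AS s ON s.sample_id = e.sample_id
        WHERE e.gene = ?
        GROUP BY s.condition
        """,
        (gene,),
    ).fetchall()
    return dict(rows)


def genes_by_variance(conn):
    """High-variance query: rank genes by population variance across samples.

    SQLite has no built-in variance aggregate, so this uses
    Var(x) = E[x^2] - (E[x])^2.
    """
    return conn.execute(
        """
        SELECT gene,
               AVG(normalized * normalized)
                 - AVG(normalized) * AVG(normalized) AS variance
        FROM expression
        GROUP BY gene
        ORDER BY variance DESC
        """
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(mean_expression_by_condition(conn, "GeneA"))
    print(genes_by_variance(conn))
```

Storing one row per sample and gene is only one possible design; it keeps cross-study filters (by species, study, or condition) as plain joins, at the cost of a larger table than a wide gene-by-sample matrix.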

Keywords: ODC-SCI; RNA-Seq; SCI; SQLite; bulk RNA-Seq; differential gene expression; spinal cord injury; transcriptomics.