Serial co-expression analysis of host factors from SARS-CoV viruses highly converges with former high-throughput screenings and proposes key regulators and co-option of cellular pathways

bioRxiv [Preprint]. 2020 Aug 7:2020.07.28.225078. doi: 10.1101/2020.07.28.225078.

Abstract

The current genomics era is bringing unprecedented growth in the amount of gene expression data, comparable only to the exponential growth of sequences in databases during the last decades. These data now allow the design of secondary analyses that take advantage of this information to create new knowledge through specific computational approaches. One such analysis is the evaluation of the expression level of a gene across a series of different conditions or cell types. Based on this idea, we have developed ASACO, Automatic and Serial Analysis of CO-expression, which builds expression profiles for a given gene across hundreds of normalized and heterogeneous transcriptomics experiments and discovers other genes that show either similar or inverse behavior. It might help to discover co-regulated genes, and even common transcriptional regulators, in any biological model, including human diseases or microbial infections. The present SARS-CoV-2 pandemic is an opportunity to test this novel approach because of the wealth of data being generated, which can be used to validate results. In addition, newly identified cell mechanisms could become new therapeutic targets. Thus, we have identified 35 host factors in the literature putatively involved in the infectious cycle of SARS-CoV and/or SARS-CoV-2 and searched for genes tightly co-expressed with them. We have found around 1900 co-expressed genes whose assigned functions are strongly related to viral cycles. Moreover, this set of genes heavily overlaps with those identified by former laboratory high-throughput screenings (with p-value near 0). Some of these genes point to cellular structures such as stress granules, which could be essential for virus replication and could thereby constitute potential targets in the current fight against the virus.
Additionally, our results reveal a series of common transcriptional regulators, involved in immune and inflammatory responses, that might be key virus targets for inducing the coordinated expression of SARS-CoV-2 host factors. Taken together, these results show that ASACO can discover gene co-regulation networks with potential for proposing new genes, pathways and regulators participating in particular biological systems.
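
The core idea described above, profiling one gene across many experiments and finding genes with similar or inverse behavior, can be illustrated with a minimal sketch. The abstract does not state which similarity measure ASACO uses, so the Pearson correlation, the 0.9 threshold, the function names and the toy gene names below are all assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpressed(profiles, query, threshold=0.9):
    """Split genes into those behaving like `query` and those behaving inversely.

    `profiles` maps gene name -> expression values, one value per experiment.
    The threshold is an assumed cut-off, not a value taken from the paper.
    """
    reference = profiles[query]
    similar, inverse = [], []
    for gene, values in profiles.items():
        if gene == query:
            continue
        r = pearson(reference, values)
        if r >= threshold:
            similar.append((gene, r))
        elif r <= -threshold:
            inverse.append((gene, r))
    return similar, inverse


# Hypothetical normalized expression values across five experiments.
profiles = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with GENE_A
    "GENE_C": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls as GENE_A rises
    "GENE_D": [3.0, 1.0, 4.0, 1.0, 5.0],  # only weakly related
}

similar, inverse = coexpressed(profiles, "GENE_A")
print("similar:", similar)
print("inverse:", inverse)
```

In this toy data set GENE_B is reported as similar and GENE_C as inverse, while GENE_D falls below the threshold in both directions.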

Highlights:

  • ASACO identifies regulatory associations of genes using public transcriptomics data.
  • ASACO highlights new cell functions likely involved in the infection of coronavirus.
  • Comparison with high-throughput screenings validates candidates proposed by ASACO.
  • Genes co-expressed with host's genes used by SARS-CoV-2 are related to stress granules.
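
The comparison with high-throughput screenings rests on how unlikely the observed overlap between two gene sets would be by chance. The abstract does not name the statistical test behind the reported p-value, so the sketch below assumes a one-sided hypergeometric test; the function name and the toy numbers are hypothetical and are not results from the paper.

```python
from math import comb


def overlap_pvalue(universe, size_a, size_b, overlap):
    """Probability of seeing at least `overlap` shared genes by chance.

    Assumes set B (size_b genes) is drawn at random, without replacement,
    from a universe of `universe` genes of which `size_a` belong to set A.
    """
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 20 genes in total, two lists of 5 genes sharing 3 genes.
p = overlap_pvalue(universe=20, size_a=5, size_b=5, overlap=3)
print(p)  # 1126/15504, about 0.073
```

With realistic set sizes (thousands of genes and a large overlap) the tail probability becomes extremely small, which is the sense in which a p-value can be "near 0".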

Publication types

  • Preprint