cRegions: a tool for detecting conserved cis-elements in multiple sequence alignment of diverged coding sequences

PeerJ. 2019 Jan 10;6:e6176. doi: 10.7717/peerj.6176. eCollection 2019.

Abstract

Identifying cis-acting elements and understanding the regulatory mechanisms of a gene are crucial to fully understanding the molecular biology of an organism. In general, it is difficult to identify previously uncharacterised cis-acting elements with an unknown consensus sequence. The task is especially problematic for viruses containing regions with limited or no similarity to previously characterised sequences. Fortunately, the rapid increase in the number of sequenced genomes allows us to detect some of these elusive cis-elements. In this work, we introduce a web-based tool called cRegions. It was developed to identify regions within a protein-coding sequence where the conservation in the amino acid sequence is caused by conservation in the nucleotide sequence. cRegions can be the first step in discovering novel cis-acting sequences in diverged protein-coding genes. The results can be used as a basis for future experimental analysis. As an example, we applied cRegions to the non-structural and structural polyproteins of alphaviruses and successfully detected all known cis-acting elements. In this publication and in previous work, we have shown that cRegions is able to detect a wide variety of functional elements in DNA and RNA viruses. These functional elements include splice sites, stem-loops, overlapping reading frames, internal promoters, ribosome frameshifting signals and other embedded elements of as yet unknown function. The cRegions web tool is available at http://bioinfo.ut.ee/cRegions/.
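The underlying idea, that a conserved amino acid can be encoded by varying synonymous codons unless the nucleotides themselves are constrained, can be illustrated with a small sketch. This is not the cRegions implementation, and the statistics the tool actually uses are not described in this abstract. The function names (`codon_conservation`, `flag_constrained`) and the toy alignment are hypothetical. The sketch assumes an in-frame, codon-aligned set of sequences and the standard genetic code.

```python
from collections import Counter

# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODE = dict(zip(CODONS, AMINO))
# Number of synonymous codons available for each amino acid.
DEGENERACY = Counter(AMINO)


def codon_conservation(alignment):
    """Score each codon column of an in-frame alignment.

    Returns one entry per codon column: the fraction of sequences that
    use the most common codon, or None when the column is uninformative
    (gaps or ambiguous bases, differing amino acids, or an amino acid
    with only one possible codon).
    """
    n_codons = min(len(s) for s in alignment) // 3
    scores = []
    for i in range(n_codons):
        column = [s[3 * i:3 * i + 3].upper() for s in alignment]
        if any(c not in CODE for c in column):
            scores.append(None)  # gap or ambiguous base in this column
            continue
        residues = {CODE[c] for c in column}
        if len(residues) != 1 or DEGENERACY[next(iter(residues))] == 1:
            scores.append(None)  # not conserved, or no synonymous choice
            continue
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def flag_constrained(scores, threshold=1.0):
    """Indices of codon columns whose codon identity meets the threshold."""
    return [i for i, s in enumerate(scores) if s is not None and s >= threshold]


# Toy alignment (hypothetical data): Met, Ala, Lys, Gly in every sequence.
toy = [
    "ATGGCTAAAGGT",
    "ATGGCCAAAGGA",
    "ATGGCGAAAGGC",
]
scores = codon_conservation(toy)
flagged = flag_constrained(scores)
```

In the toy alignment, the alanine and glycine columns use a different codon in every sequence, whereas the lysine column uses the same codon throughout, so only that column is flagged. A real analysis would need many diverged sequences and a comparison against the codon usage expected by chance before treating such a column as evidence of an embedded element.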

Keywords: Alphavirus; Cis-acting sequence; Cis-element; Codon usage bias; Embedded functional element; Multiple sequence alignment analysis; Viruses.

Grants and funding

The development of the cRegions webpage was supported by the European Regional Development Fund through the Research Internationalization Programme (ELIXIR) and by the Lydia and Felix Krabi scholarship. Aare Abroi’s work was supported partially by ‘Basic research financing’ to the Estonian Biocentre and partially by grant PRG198 from the Estonian Research Council to Prof. Mart Ustav. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.