CodingDiv: analyzing SNP-level microdiversity to discriminate between coding and noncoding regions in viral genomes

Bioinformatics. 2023 Jul 1;39(7):btad408. doi: 10.1093/bioinformatics/btad408.

Abstract

Summary: Viral genes, which are frequently small and/or largely overlapping, are still difficult to predict accurately. To help predict all genes in viral genomes, we provide CodingDiv, which detects SNP-level microdiversity in all potential coding regions using metagenomic reads and/or similar sequences from external databases. Protein-coding regions can then be identified as those containing more synonymous SNPs than unfavorable nonsynonymous SNPs.
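
The synonymous versus nonsynonymous comparison described in the summary can be illustrated with a minimal Python sketch. This is not CodingDiv's implementation: the function names (`classify_snps`, `looks_coding`), the input format (0-based position and alternative base within a candidate ORF), the use of the standard genetic code, and the simple "more synonymous than nonsynonymous" decision rule are all assumptions made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snps(orf_seq, snps):
    """Count synonymous and nonsynonymous SNPs in a candidate ORF.

    orf_seq: in-frame nucleotide sequence of the candidate coding region.
    snps: iterable of (position, alt_base), position 0-based within orf_seq
          (hypothetical input format, chosen for this sketch).
    Returns a (synonymous, nonsynonymous) tuple of counts.
    """
    orf_seq = orf_seq.upper()
    synonymous = 0
    nonsynonymous = 0
    for position, alt_base in snps:
        codon_start = position - (position % 3)
        ref_codon = orf_seq[codon_start:codon_start + 3]
        if len(ref_codon) < 3:
            continue  # SNP falls in a trailing partial codon; skip it
        offset = position - codon_start
        alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
        if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
            continue  # ambiguous bases (e.g. N) cannot be classified
        if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


def looks_coding(orf_seq, snps):
    """Assumed decision rule: more synonymous than nonsynonymous SNPs."""
    synonymous, nonsynonymous = classify_snps(orf_seq, snps)
    return synonymous > nonsynonymous
```

For example, calling `classify_snps("ATGGCTAAA", [(5, "C"), (8, "G"), (3, "A")])` evaluates two third-position changes (GCT to GCC, AAA to AAG) and one first-position change (GCT to ACT). Each SNP is evaluated independently against the reference codon, which is a simplification when two SNPs fall in the same codon.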

Availability and implementation: CodingDiv is released under the GPL license. Source code is available at https://github.com/ericolo/codingDiv. The software can be installed and used through a Docker container.

MeSH terms

  • Databases, Factual
  • Genome, Viral
  • Metagenomics
  • Polymorphism, Single Nucleotide*
  • Software*