Comparative Genomics, from the Annotated Genome to Valuable Biological Information: A Case Study

Methods Mol Biol. 2021;2242:91-112. doi: 10.1007/978-1-0716-1099-2_7.

Abstract

The wide availability of fast, cheap, high-throughput next-generation sequencing techniques has resulted in the acquisition of numerous de novo sequenced and assembled bacterial genomes. It rapidly became clear that extracting useful biological information from such a huge amount of data presents a considerable challenge. In this chapter, we share our experience with several handy open-source comparative genomics tools. All of them were applied in studies focused on revealing inter- and intraspecies variation in pectinolytic plant pathogenic bacteria classified as Dickeya solani and Pectobacterium parmentieri. As the described software performed well on species within the Pectobacteriaceae family, it can presumably be readily applied to some closely related taxa from the Enterobacteriaceae family. First, the use of various annotation software is discussed and compared. Then, tools for whole-genome comparison are presented, including those that generate circular juxtapositions of multiple sequences, reveal the order of synteny blocks, or calculate ANI or Tetra values. In addition, web servers intended either for functional annotation of genes of interest or for detection of genomic islands, plasmids, prophages, and CRISPR/Cas systems are described. Finally, the use of software designed for pangenome studies and further downstream analyses is explained. The presented work not only summarizes the broad possibilities offered by the comparative genomic approach but also provides a user-friendly guide that might be easily followed by nonbioinformaticians interested in undertaking similar studies.

Keywords: Annotation; CRISPR/Cas; Dickeya spp.; Next generation sequencing; Pangenome study; Pectobacterium spp.; Phages; Synteny.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA, Bacterial / genetics*
  • Databases, Genetic
  • Dickeya / genetics*
  • Genome, Bacterial*
  • Genomics*
  • High-Throughput Nucleotide Sequencing*
  • Pectobacterium / genetics*
  • Research Design
  • Sequence Analysis, DNA*
  • Software Design
  • Workflow

Substances

  • DNA, Bacterial

Supplementary concepts

  • Dickeya solani
  • Pectobacterium parmentieri