Effective Identification of Bacterial Genomes From Short and Long Read Sequencing Data

IEEE/ACM Trans Comput Biol Bioinform. 2022 Sep-Oct;19(5):2806-2816. doi: 10.1109/TCBB.2021.3095164. Epub 2022 Oct 10.

Abstract

With the development of sequencing technology, microbial genome sequencing analysis has attracted extensive attention. For users without sufficient bioinformatics skills, making sense of sequencing data for microbial identification, especially bacterial identification, through read analysis remains challenging. To address this challenge, we develop an approach and automated bioinformatics pipeline called PBGI for bacterial genome identification. PBGI performs automated and customized bioinformatics analysis on short-read or long-read sequencing data produced by multiple platforms such as Illumina, PacBio and Oxford Nanopore. An evaluation of the proposed approach on a practical data set shows that PBGI provides a user-friendly way to perform bacterial identification through short- or long-read analysis and can provide accurate results. The source code of PBGI is freely available at https://github.com/lyotvincent/PBGI.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genome, Bacterial* / genetics
  • Genomics
  • High-Throughput Nucleotide Sequencing* / methods
  • Sequence Analysis, DNA / methods
  • Software