PipeCoV: a pipeline for SARS-CoV-2 genome assembly, annotation and variant identification

PeerJ. 2022 Apr 13;10:e13300. doi: 10.7717/peerj.13300. eCollection 2022.

Abstract

Motivation: Since the identification of the novel coronavirus (SARS-CoV-2), the scientific community has made a huge effort to understand the virus biology and to develop vaccines. Next-generation sequencing strategies have been successful in understanding the evolution of infectious diseases as well as facilitating the development of molecular diagnostics and treatments. Thousands of genomes are being generated weekly to understand the genetic characteristics of this virus. Efficient pipelines are needed to analyze the vast amount of data generated. Here we present a new pipeline designed for genomic analysis and variant identification of the SARS-CoV-2 virus.

Results: PipeCoV outperforms well-established SARS-CoV-2 pipelines, producing assemblies with a lower content of Ns and higher genome coverage relative to the Wuhan reference. It also provides a variant report that the other tested pipelines do not offer.
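The two quality metrics named in the Results (content of Ns and genome coverage against a reference) can be illustrated with a minimal Python sketch. This is not PipeCoV's code: the function names `n_content` and `genome_coverage` are hypothetical, and the definitions are assumptions, since the abstract does not state how either metric was calculated. Here N content is taken as the fraction of consensus positions called as N, and coverage as the number of unambiguous A/C/G/T calls divided by the reference length.

```python
def n_content(consensus: str) -> float:
    """Fraction of positions in the consensus that are ambiguous 'N' calls."""
    seq = consensus.upper()
    if not seq:
        return 0.0
    return seq.count("N") / len(seq)


def genome_coverage(consensus: str, reference_length: int) -> float:
    """Unambiguous A/C/G/T calls as a fraction of the reference length.

    Assumed definition for illustration only; the paper may compute
    coverage differently (for example, from read alignments).
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    called = sum(1 for base in consensus.upper() if base in "ACGT")
    return called / reference_length


if __name__ == "__main__":
    # Toy 10-base consensus against a toy 10-base reference.
    toy = "ACGTNNACGT"
    print(f"N content: {n_content(toy):.2f}")
    print(f"Coverage:  {genome_coverage(toy, 10):.2f}")
```

Under these assumed definitions, an assembly with fewer N positions scores a lower N content and, for the same reference length, a higher coverage, which is the direction of improvement the Results report.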

Availability: https://github.com/alvesrco/pipecov.

Keywords: Annotation; Covid19; Genomics; Pipeline; Sarscov2; Variant identification; Virus.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19* / genetics
  • Genome, Viral / genetics
  • Genomics
  • Humans
  • SARS-CoV-2 / genetics
  • Viruses* / genetics

Grants and funding

This work received financial support from Vale (Projeto Covid, RBRS000603.12) and the CABANA project (RCUK/BB/P027849/1) to Guilherme Oliveira. Guilherme Oliveira is a CNPq (Conselho Nacional de Desenvolvimento Científico) fellow (307479/2016-1), and Tatianne Costa Negri is a Fiocruz fellow (VPPCB-007-FEX-20). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.