Explorative assessment of coronavirus-like short sequences from host-associated and environmental metagenomes

Sci Total Environ. 2021 Nov 1:793:148494. doi: 10.1016/j.scitotenv.2021.148494. Epub 2021 Jun 24.

Abstract

The ongoing COVID-19 pandemic has not only caused a high number of casualties worldwide, but is also an unprecedented challenge for scientists. False-positive virus detection tests not only aggravate the situation in the healthcare sector, but also provide grounds for speculation. Previous studies have highlighted the importance of software choice and data interpretation in virome studies. We aimed to further expand theoretical and practical knowledge in bioinformatics-driven virome studies by focusing on short, virus-like DNA sequences in metagenomic data. Analyses of datasets obtained from different sample types (terrestrial, animal-related and human-related samples) and origins showed that coronavirus-like sequences existed in host-associated and environmental samples before the current COVID-19 pandemic. In the analyzed datasets, various Betacoronavirus-like sequences were detected, including matches to SARS-CoV-2. In-depth analyses indicated that the detected sequences are not of viral origin and thus should not be considered in virome profiling approaches. Our study confirms the importance of parameter selection, especially in terms of read length, for reliable virome profiling. Natural environments are an important source of coronavirus-like nucleotide sequences that should be taken into account when virome datasets are analyzed and interpreted. We therefore suggest that processing parameters be carefully selected for SARS-CoV-2 profiling in host-related as well as environmental samples in order to avoid incorrect identifications.
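The read-length point above can be illustrated with a minimal sketch of a length filter applied before taxonomic assignment. This is not the pipeline used in the study: the function name `filter_by_length`, the `(identifier, sequence)` read representation and the 100-base threshold are all assumptions chosen for illustration only.

```python
from typing import Iterable, Iterator, Tuple

# A read is represented here as (identifier, nucleotide sequence).
Read = Tuple[str, str]

# Hypothetical threshold for illustration; not a value taken from the study.
MIN_READ_LENGTH = 100


def filter_by_length(
    reads: Iterable[Read], min_length: int = MIN_READ_LENGTH
) -> Iterator[Read]:
    """Yield only reads whose sequence has at least min_length bases."""
    for read_id, sequence in reads:
        if len(sequence) >= min_length:
            yield read_id, sequence


# Toy input: one short read (40 bases) and one longer read (160 bases).
example_reads = [
    ("read_1", "ACGT" * 10),
    ("read_2", "ACGT" * 40),
]

# Only the longer read passes the default threshold.
kept = list(filter_by_length(example_reads))
```

A suitable threshold would depend on the sequencing platform and the classifier in use, so the value would need to be chosen and validated for each dataset.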

Keywords: COVID-19; Coronaviruses; Metagenomics; SARS-CoV-2; Virome profiling.

MeSH terms

  • Animals
  • COVID-19*
  • Humans
  • Metagenome
  • Metagenomics
  • Pandemics*
  • SARS-CoV-2