A curated dataset of peste des petits ruminants virus sequences for molecular epidemiological analyses

PLoS One. 2022 Feb 10;17(2):e0263616. doi: 10.1371/journal.pone.0263616. eCollection 2022.

Abstract

Peste des petits ruminants (PPR) is a highly contagious and devastating viral disease that predominantly infects sheep and goats. Tracking outbreaks of the disease and analysing the movement of the virus (PPRV) often involves sequencing part or all of the genome and comparing the sequence obtained with sequences from other outbreaks held in the public databases. However, there are a very large number (>1800) of PPRV sequences in the databases, a large majority of them relatively short, and not always well documented. There is also a strong bias in the composition of the dataset: countries with good sequencing capabilities (e.g. China, India, Turkey) are overrepresented, and most sequences come from isolates from the last 20 years. To facilitate future analyses, we have prepared sets of PPRV sequences that have been filtered for sequencing errors and unnecessary duplicates, and for which date and location information has been obtained, either from the database entry or from other published sources. These sequence datasets are freely available for download, and include smaller datasets which maximise phylogenetic information from the minimum number of sequences, and which will be useful for simple lineage identification. Their utility is illustrated by uploading the data to the MicroReact platform to allow simultaneous viewing of lineage, date and geographic information on all the viruses for which we have information. While preparing these datasets, we identified a significant number of public database entries which contain clear errors, and we propose guidelines on checking new sequences and completing metadata before submission.
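
The kind of filtering described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Record` fields, the `curate` function, the ambiguous-base threshold used as a stand-in for "sequencing errors", and the rule that a duplicate is an identical sequence with the same year and country are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Hypothetical record layout; real database entries carry many more fields.
@dataclass(frozen=True)
class Record:
    accession: str
    sequence: str
    year: Optional[int]
    country: Optional[str]

UNAMBIGUOUS = set("ACGT")

def curate(records: Iterable[Record],
           max_ambiguous_fraction: float = 0.01) -> List[Record]:
    """Keep records with complete metadata, few ambiguous bases,
    and no earlier identical sequence from the same year and country."""
    kept: List[Record] = []
    seen = set()
    for rec in records:
        # Normalise case and drop alignment gaps before comparing.
        seq = rec.sequence.upper().replace("-", "")
        # Assumed rule: date and location are both required.
        if not seq or rec.year is None or not rec.country:
            continue
        # Assumed proxy for sequence quality: fraction of non-ACGT symbols.
        ambiguous = sum(1 for base in seq if base not in UNAMBIGUOUS)
        if ambiguous / len(seq) > max_ambiguous_fraction:
            continue
        # Assumed duplicate rule: same sequence, same year, same country.
        key = (seq, rec.year, rec.country)
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept

# Invented example data, not taken from any database.
example = [
    Record("X1", "ACGTACGT", 2010, "India"),
    Record("X2", "acgtacgt", 2010, "India"),   # duplicate of X1
    Record("X3", "ACGTACGT", 2015, "Kenya"),   # same sequence, other outbreak
    Record("X4", "ACGTNNNN", 2012, "China"),   # too many ambiguous bases
    Record("X5", "ACGTACGA", None, "Turkey"),  # missing year
]
print([r.accession for r in curate(example)])
```

Identical sequences from different years or countries are retained here because they may still be informative about virus movement; a stricter or looser duplicate rule would only change the `key` tuple.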

Publication types

  • Dataset
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Curation
  • Epidemiologic Methods*
  • Genome, Viral*
  • Humans
  • Peste-des-petits-ruminants virus / genetics*
  • RNA, Viral*
  • Recombination, Genetic
  • Sequence Analysis, RNA*
  • Whole Genome Sequencing

Substances

  • RNA, Viral

Grants and funding

MDB received no specific funding for this work. The Article Publishing Charge was paid by the UK Biotechnology and Biological Sciences Research Council through the core grant to The Pirbright Institute. AB is supported by a grant (SI2.756606) from the European Commission Directorate General for Health and Food Safety awarded to the European Union Reference Laboratory for Peste des Petits Ruminants (EURL-PPR). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.