Novel Network Method Major Minor Variation Clustering Enables Identification of Poliovirus Clusters with High-Resolution Linkages

J Comput Biol. 2023 Apr;30(4):409-419. doi: 10.1089/cmb.2022.0292. Epub 2022 Sep 16.

Abstract

The Global Polio Eradication Initiative uses an outbreak response protocol that defines type 2 Sabin or Sabin-like viruses as those that differ from their parental strain by 0-5 nucleotides in the complete VP1 genomic region. Sabin or Sabin-like viruses share highly similar genome sequences, regardless of their origin. This makes it challenging to distinguish viruses at a resolution high enough to detect polio clusters or to trace the sources of local transmission at an early stage. To identify type 2 Sabin or Sabin-like sources and improve our ability to map viral sources to campaigns during the polio endgame, we investigated the feasibility of a new method for genetic sequence analysis. The method, which we named Major Minor Variation Clustering (MMVC), uses a network model that simultaneously incorporates sequence similarity in major and minor variants, together with onset dates, to detect fine-scale polio clusters. Each identified cluster represents a collection of sequences that are highly similar in both major and minor variants, enabling the discovery of new links between viruses. By applying the method to a published data set collected in Nigeria during 2009-2012, we found that clusters identified using this method show several improvements over clusters derived from a phylogenetic tree approach. Integrative data analysis reveals that sequences in the same cluster have greater genomic similarity and better agreement with onset dates. As a complement to current phylogenetic tree approaches, MMVC has the potential to improve the precision of epidemiological surveillance and investigation to guide polio eradication.
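The abstract does not give the details of the MMVC network model, so the following Python sketch is only an illustration of the general idea, not the published algorithm. It assumes that two samples are linked when their consensus (major variant) sequences, their sets of minor variants, and their onset dates all agree within some threshold, and it reports connected components of the resulting network as clusters. The `Sample` type, the function names, the Jaccard measure, and every threshold value except the 0-5 nucleotide VP1 rule are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class Sample:
    name: str
    consensus: str    # major variants: aligned consensus sequence
    minor: frozenset  # minor variants: set of (position, base) pairs
    onset: date       # onset date of the case


def hamming(a, b):
    """Count differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def is_sabin_like(vp1, parental_vp1, max_diff=5):
    """0-5 nucleotide differences in VP1, as stated in the abstract."""
    return hamming(vp1, parental_vp1) <= max_diff


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def linked(s, t, max_major_diff=1, min_minor_jaccard=0.5, max_days=60):
    """Hypothetical edge rule: all three criteria must hold."""
    return (hamming(s.consensus, t.consensus) <= max_major_diff
            and jaccard(s.minor, t.minor) >= min_minor_jaccard
            and abs((s.onset - t.onset).days) <= max_days)


def find_clusters(samples, **thresholds):
    """Build the link network and return its connected components."""
    adjacency = {s.name: set() for s in samples}
    for s, t in combinations(samples, 2):
        if linked(s, t, **thresholds):
            adjacency[s.name].add(t.name)
            adjacency[t.name].add(s.name)
    seen, result = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


# Toy data (invented): four samples with identical consensus sequences.
samples = [
    Sample("A", "ACGTACGT", frozenset({(3, "C")}), date(2011, 1, 10)),
    Sample("B", "ACGTACGT", frozenset({(3, "C")}), date(2011, 2, 1)),
    Sample("C", "ACGTACGT", frozenset({(6, "A")}), date(2011, 1, 20)),
    Sample("D", "ACGTACGT", frozenset({(3, "C")}), date(2012, 6, 1)),
]
print(find_clusters(samples))  # [['A', 'B'], ['C'], ['D']]
```

In this toy example all four consensus sequences are identical, so a consensus-only comparison could not separate them. Under the assumed rule, the minor variants split off C and the onset date splits off D, which is the kind of finer resolution the abstract attributes to combining major variants, minor variants, and onset dates.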

Keywords: genomic epidemiology; network modeling; polio eradication; whole genome sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Genomics
  • Humans
  • Phylogeny
  • Poliomyelitis* / epidemiology
  • Poliovirus* / genetics