Comparison of in silico predicted Mycobacterium tuberculosis spoligotypes and lineages from whole genome sequencing data

Sci Rep. 2023 Jul 13;13(1):11368. doi: 10.1038/s41598-023-38384-3.

Abstract

Bacterial strain-types in the Mycobacterium tuberculosis complex underlie tuberculosis disease, and have been associated with drug resistance, transmissibility, virulence, and host-pathogen interactions. Spoligotyping was developed as a molecular genotyping technique used to determine strain-types, though recent advances in whole genome sequencing (WGS) technology have led to their characterization using SNP-based sub-lineage nomenclature. Notwithstanding, spoligotyping remains an important tool and there is a need to study the congruence between spoligotyping-based and SNP-based sub-lineage assignation. To achieve this, an in silico spoligotype prediction method ("Spolpred2") was developed and integrated into TB-Profiler. Lineage and spoligotype predictions were generated for > 28 k isolates and the overlap between strain-types was characterized. Major spoligotype families detected were Beijing (25.6%), T (18.6%), LAM (13.1%), CAS (9.4%), and EAI (8.3%), and these broadly followed known geographic distributions. Most spoligotypes were perfectly correlated with the main MTBC lineages (L1-L7, plus animal). Conversely, at lower levels of the sub-lineage system, the relationship breaks down, with only 65% of spoligotypes being perfectly associated with a sub-lineage at the second or subsequent levels of the hierarchy. Our work supports the use of spoligotyping (membrane or WGS-based) for low-resolution surveillance, and WGS or SNP-based systems for higher-resolution studies.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Typing Techniques
  • Beijing
  • Drug Resistance
  • Genotype
  • Humans
  • Mycobacterium tuberculosis*
  • Tuberculosis* / microbiology