The rate and role of pseudogenes of the Mycobacterium tuberculosis complex

Microb Genom. 2022 Oct;8(10):mgen000876. doi: 10.1099/mgen.0.000876.

Abstract

Whole-genome sequence analyses have contributed significantly to the understanding of the virulence and evolution of the Mycobacterium tuberculosis complex (MTBC), the causative agents of tuberculosis. Most MTBC evolutionary studies focus on single-nucleotide polymorphisms and deletions; few have evaluated gene content, and none has comprehensively evaluated pseudogenes. Accordingly, we describe an extensive study focused on quantifying MTBC and Mycobacterium canettii pseudogenes and predicting their possible functions. Using pseudogenes detected by NCBI's PGAP, we analysed 25 837 pseudogenes from 158 MTBC and M. canettii strains and combined transcriptomics and proteomics of M. tuberculosis H37Rv to gain insights into pseudogene expression. Our results indicate significant variability in the rate and conservation of in silico predicted pseudogenes among different ecotypes and lineages of tuberculous mycobacteria, as well as pseudogenization of important virulence factors and of genes involved in metabolism and antimicrobial resistance/tolerance. We show that in silico predicted pseudogenes contribute considerably to MTBC genetic diversity at the population level. Moreover, the transcription machinery of M. tuberculosis can fully transcribe most pseudogenes, indicating intact promoters and recent evolutionary emergence of these pseudogenes. Proteomics of M. tuberculosis and close evaluation of the mutational lesions driving pseudogenization suggest that a few in silico predicted pseudogenes are likely capable of neofunctionalization, nonsense mutation reversal, or phase variation, contradicting the classical definition of pseudogenes. Such findings indicate that genome annotation should be accompanied by proteomics and protein function assays to improve its accuracy. While indels and insertion sequences are the main drivers of the observed mutational lesions in these species, population bottlenecks and genetic drift are likely the evolutionary processes shaping pseudogene emergence over time. Our findings unveil a new perspective on the evolution and genetic diversity of the MTBC.

Keywords: Mycobacterium tuberculosis complex; comparative genomics; frameshift; loss-of-function mutations; phase variation; pseudogenes.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Anti-Infective Agents
  • Codon, Nonsense
  • DNA Transposable Elements
  • Drug Resistance, Bacterial / genetics
  • Mycobacterium tuberculosis* / drug effects
  • Mycobacterium tuberculosis* / genetics
  • Pseudogenes* / genetics
  • Virulence Factors / genetics

Substances

  • Anti-Infective Agents
  • Codon, Nonsense
  • DNA Transposable Elements
  • Virulence Factors