Pan-genome association study of Mycobacterium tuberculosis lineage-4 revealed specific genes related to the high and low prevalence of the disease in patients from the North-Eastern area of Medellín, Colombia

Front Microbiol. 2023 Jan 4:13:1076797. doi: 10.3389/fmicb.2022.1076797. eCollection 2022.

Abstract

Mycobacterium tuberculosis (Mtb) lineage 4 is responsible for the highest burden of tuberculosis (TB) worldwide. This lineage has been the most prevalent lineage in Colombia, especially in the North-Eastern (NE) area of Medellin, where it has been shown to have a high prevalence of LAM9 SIT42 and Haarlem1 SIT62 sublineages. There is evidence that regardless of environmental factors and host genetics, differences among sublineages of Mtb strains play an important role in the course of infection and disease. Nevertheless, the genetic basis of the success of a sublineage in a specific geographic area remains uncertain. We used a pan-genome-wide association study (pan-GWAS) of 47 Mtb strains isolated from NE Medellin between 2005 and 2008 to identify the genes responsible for the phenotypic differences among high and low prevalence sublineages. Our results allowed the identification of 12 variants in 11 genes, of which 4 genes showed the strongest association to low prevalence (mmpL12, PPE29, Rv1419, and Rv1762c). The first three have been described as necessary for invasion and intracellular survival. Polymorphisms identified in low prevalence isolates may suggest related to a fitness cost of Mtb, which might reflect a decrease in their capacity to be transmitted or to cause an active infection. These results contribute to understanding the success of some sublineages of lineage-4 in a specific geographical area.

Keywords: M. tuberculosis; pan-GWAS; pan-genome; prevalence; sublineage; transmission; variant.

Grants and funding

This work was funded by the Ministerio de Ciencia Tecnología e Innovación–Minciencias (Project code: 657057636375).