Gene expression models based on a reference laboratory strain are poor predictors of Mycobacterium tuberculosis complex transcriptional diversity

Sci Rep. 2018 Feb 28;8(1):3813. doi: 10.1038/s41598-018-22237-5.

Abstract

Every year, species of the Mycobacterium tuberculosis complex (MTBC) kill more people than any other infectious disease caused by a single agent. As a consequence of their global distribution and parallel evolution with the human host, the bacteria are not genetically homogeneous. The observed genetic heterogeneity has relevance at different phenotypic levels, from gene expression to epidemiological dynamics. However, current systems biology datasets have focused on the laboratory reference strain H37Rv. Using large expression datasets that test the role of almost two hundred transcription factors, we constructed computational models to capture the expression dynamics of Mycobacterium tuberculosis H37Rv genes. We found, however, that many of those transcription factors are deleted or likely dysfunctional across strains of the MTBC. As a result, when compared with experimental data, the models failed to predict expression changes in strains with a different genetic background. These results highlight the importance of designing systems biology approaches that take into account the genetic diversity of tubercle bacilli, or of any other pathogen, if we want to identify universal targets for vaccines, diagnostics and treatments.
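The reasoning behind the failed predictions can be illustrated with a toy example. The sketch below is not the authors' model: it assumes a simple linear relationship between transcription factor activity and a target gene's log fold change, and every name and number in it (tfA, tfB, the weights, the activities) is hypothetical. It only shows why a model fitted on a reference strain can mispredict expression in a strain where one of its transcription factors is deleted or nonfunctional.

```python
# Toy illustration only: hypothetical weights and values, not data from the study.

def predict_log_fold_change(tf_activity, weights):
    """Predict a target gene's log fold change as a weighted sum of TF activities."""
    return sum(weights[tf] * tf_activity.get(tf, 0.0) for tf in weights)


def effective_activity(assumed, nonfunctional):
    """Set the activity of deleted or nonfunctional TFs to zero."""
    return {tf: (0.0 if tf in nonfunctional else a) for tf, a in assumed.items()}


# Hypothetical regulatory weights, as if learned from the reference strain.
weights = {"tfA": 1.5, "tfB": -0.5}

# TF activities that the reference-strain model takes for granted.
assumed_activity = {"tfA": 2.0, "tfB": 1.0}

# Prediction made under the reference-strain assumption.
predicted = predict_log_fold_change(assumed_activity, weights)

# Hypothetical clinical strain in which tfA is deleted, so it contributes nothing.
strain_activity = effective_activity(assumed_activity, {"tfA"})
observed = predict_log_fold_change(strain_activity, weights)

# Gap between what the reference model expects and what the strain would show.
error = predicted - observed
print(predicted, observed, error)
```

In this toy setting the reference-strain prediction and the value for the strain lacking tfA even differ in sign, which is the kind of mismatch the abstract describes at a much larger scale.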

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Conserved Sequence
  • Gene Expression Profiling / standards*
  • Gene Regulatory Networks
  • Genes, Bacterial / genetics
  • Genetic Variation*
  • Genomics
  • Models, Statistical*
  • Mutation
  • Mycobacterium tuberculosis / genetics*
  • Reference Standards
  • Transcription, Genetic*