Genome-Wide Analysis of MDR and XDR Tuberculosis from Belarus: Machine-Learning Approach

IEEE/ACM Trans Comput Biol Bioinform. 2019 Jul-Aug;16(4):1398-1408. doi: 10.1109/TCBB.2017.2720669. Epub 2017 Jun 27.

Abstract

The emergence of drug-resistant microorganisms has been recognized as a serious threat to public health worldwide. This problem is extensively discussed in the context of tuberculosis treatment. Alterations in pathogen genomes are among the main mechanisms by which microorganisms acquire drug resistance. In this paper, we analyzed 144 M. tuberculosis strains isolated in Belarus, covering drug-susceptible, MDR, and XDR phenotypes. We investigated a wide range of machine learning methods that can discover SNPs related to drug resistance across whole bacterial genomes. Besides single-SNP testing approaches, we considered methods that detect joint effects of interacting SNPs. We propose a framework for automated selection of the best-performing statistical model, in terms of recall, precision, and accuracy, to identify drug resistance-associated mutations. Analysis of whole-genome sequences often leads to situations where the number of features exceeds the number of available observations. For this reason, special attention is paid to fair evaluation of model prediction quality and to minimizing the risk of overfitting while estimating the underlying parameters. We present the results of experiments aimed at identifying top-scoring resistance mutations to the major first-line and second-line anti-TB drugs.
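To make the evaluation idea concrete, the following is a minimal Python sketch of scoring a single-SNP predictor by recall, precision, and accuracy under k-fold cross-validation. It is not the framework proposed in the paper: the function names (`confusion_metrics`, `best_single_snp`, `cross_validated_metrics`), the 0/1 encoding of SNP presence and resistance, and the toy data are all assumptions made for illustration. The point it shows is that the SNP is chosen on the training folds only, so the held-out scores are not inflated by the selection step, which matters when features outnumber isolates.

```python
from typing import Dict, List, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Recall, precision and accuracy for binary labels (1 = resistant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


def best_single_snp(X: List[List[int]], y: List[int]) -> int:
    """Index of the SNP column that agrees with the label most often."""
    def agreement(j: int) -> int:
        return sum(1 for row, label in zip(X, y) if row[j] == label)
    return max(range(len(X[0])), key=agreement)


def cross_validated_metrics(X: List[List[int]], y: List[int], k: int = 4) -> Dict[str, float]:
    """K-fold CV: pick the SNP on the training folds, predict the held-out fold."""
    n = len(y)
    y_pred = [0] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        j = best_single_snp([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            # Hypothetical rule: SNP present => predict resistant.
            y_pred[i] = X[i][j]
    return confusion_metrics(y, y_pred)


# Toy data (invented): 8 isolates x 3 SNPs; column 1 tracks resistance exactly.
X = [
    [0, 1, 0], [1, 1, 0], [0, 1, 1], [1, 1, 0],
    [0, 0, 1], [1, 0, 0], [0, 0, 1], [1, 0, 0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]
metrics = cross_validated_metrics(X, y, k=4)
```

The same loop extends to comparing several candidate models: compute `cross_validated_metrics` for each and rank them by the metric of interest, keeping every selection step inside the training folds.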

MeSH terms

  • Algorithms
  • Alleles
  • Antitubercular Agents / pharmacology
  • Bayes Theorem
  • Drug Resistance, Bacterial
  • Extensively Drug-Resistant Tuberculosis / epidemiology
  • Extensively Drug-Resistant Tuberculosis / microbiology*
  • Genetic Variation
  • Genome
  • Genome-Wide Association Study*
  • Humans
  • Machine Learning
  • Models, Statistical
  • Mutation
  • Mycobacterium tuberculosis / genetics*
  • Phenotype
  • Phylogeny
  • Principal Component Analysis
  • Republic of Belarus / epidemiology
  • Tuberculosis, Multidrug-Resistant / epidemiology
  • Tuberculosis, Multidrug-Resistant / microbiology*

Substances

  • Antitubercular Agents