Automated Microbial Library Generation Using the Bioinformatics Platform IDBac

Molecules. 2022 Mar 22;27(7):2038. doi: 10.3390/molecules27072038.

Abstract

Libraries of microorganisms have served as a cornerstone of therapeutic drug discovery, though the continued re-isolation of known natural product chemical entities has remained a significant obstacle to discovery efforts. A major contributing factor to this redundancy is the duplication of bacterial taxa in a library, which can be mitigated through the use of a variety of DNA sequencing strategies and/or mass spectrometry-informed bioinformatics platforms so that the library is created with minimal phylogenetic, and thus minimal natural product, overlap. IDBac is a MALDI-TOF mass spectrometry-based bioinformatics platform used to assess overlap within collections of environmental bacterial isolates. It allows environmental isolate redundancy to be reduced while considering both phylogeny and natural product production. However, manually selecting isolates for addition to a library during this process was time-intensive and left to the researcher's discretion. Here, we developed an algorithm that automates the prioritization of hundreds to thousands of environmental microorganisms in IDBac. The algorithm performs iterative reduction of natural product mass feature overlap within groups of isolates that share high homology of protein mass features. Employing this automation serves to minimize human bias and greatly increase efficiency in the microbial strain prioritization process.
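
The iterative reduction described above can be illustrated with a minimal sketch. This is not the IDBac implementation; it assumes that isolates have already been grouped by protein-spectrum similarity and that each isolate's natural product spectrum has been reduced to a set of binned mass features. The function name `prioritize_isolates`, the `coverage` parameter, and the greedy selection rule are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set


def prioritize_isolates(
    groups: Dict[str, List[str]],
    features: Dict[str, Set[int]],
    coverage: float = 1.0,
) -> Dict[str, List[str]]:
    """Greedily pick isolates per group until the chosen fraction of the
    group's unique mass features is covered (hypothetical sketch).

    groups:   group id -> isolate ids that share similar protein spectra
    features: isolate id -> set of binned natural product mass features
    coverage: fraction (0..1) of the group's unique features to retain
    """
    selected: Dict[str, List[str]] = {}
    for group_id, isolates in groups.items():
        # All distinct features observed anywhere in this group.
        all_features: Set[int] = set().union(*(features[i] for i in isolates))
        target = coverage * len(all_features)

        covered: Set[int] = set()
        picks: List[str] = []
        remaining = sorted(isolates)  # sorted for deterministic tie-breaking

        while remaining and len(covered) < target:
            # Choose the isolate that adds the most features not yet covered.
            best = max(remaining, key=lambda i: len(features[i] - covered))
            if not features[best] - covered:
                break  # nothing new left to gain in this group
            picks.append(best)
            covered |= features[best]
            remaining.remove(best)

        selected[group_id] = picks
    return selected


# Toy example with made-up isolate names and feature bins.
example_groups = {"g1": ["A", "B", "C"], "g2": ["D"]}
example_features = {
    "A": {1, 2, 3},
    "B": {2, 3},
    "C": {3, 4},
    "D": {10},
}
full = prioritize_isolates(example_groups, example_features)
partial = prioritize_isolates(example_groups, example_features, coverage=0.75)
```

In the toy data, isolate B contributes no feature that A does not already provide, so it is dropped from group g1, while C is kept for its one unique feature. Lowering `coverage` trades retained chemical diversity for a smaller library.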

Keywords: IDBac; MALDI; bioinformatics; drug discovery; microorganisms; natural products.

MeSH terms

  • Bacteria / genetics
  • Biological Products* / chemistry
  • Computational Biology*
  • Gene Library
  • Humans
  • Phylogeny
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization

Substances

  • Biological Products