Antibacterial Activity Prediction Model of Traditional Chinese Medicine Based on Combined Data-Driven Approach and Machine Learning Algorithm: Constructed and Validated

Front Microbiol. 2021 Nov 22:12:763498. doi: 10.3389/fmicb.2021.763498. eCollection 2021.

Abstract

Traditional Chinese medicines (TCMs), a unique natural medicine resource, have long been used in China to prevent and treat bacterial diseases. To provide a prediction model for screening antibacterial TCMs in support of the design and discovery of novel antibacterial agents, the literature on antibacterial TCMs in the China National Knowledge Infrastructure (CNKI) and Web of Science databases was retrieved, and the data were extracted and standardized. A total of 28,786 data records covering 904 antibacterial TCMs were collected, with plant medicines accounting for the largest share. Association rule mining showed that antibacterial activity was strongly associated with cold nature, bitter and sour tastes, and hemostatic and fire-purging efficacies. Moreover, TCMs with antibacterial activity clustered in specific regions of the phylogenetic tree; 92% of them came from Tracheophyta, of which 74% were concentrated in rosids, asterids, Liliopsida, and Ranunculales. Prediction models of anti-Escherichia coli and anti-Staphylococcus aureus activity, with AUC values (the area under the ROC curve) of 77.5% and 80.0%, respectively, were constructed by the Neural Networks (NN) algorithm after Bagged Classification and Regression Tree (Bagged CART) and Linear Discriminant Analysis (LDA) selection. In vitro experiments showed that the prediction accuracy of these two models was 75% and 60%, respectively. Four TCMs (Cirsii Japonici Herba Carbonisata, Changii Radix, Swertiae Herba, Callicarpae Formosanae Folium) were shown for the first time to have antibacterial activity against E. coli and/or S. aureus. These results imply that a prediction model of TCM antibacterial activity based on properties and families has a certain predictive ability, which is of great significance for screening antibacterial TCMs and can be used to discover novel antibacterial agents.
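
The association rule mining step mentioned above can be illustrated with a minimal sketch. It computes the three standard measures of a rule such as "cold nature implies antibacterial activity": support, confidence, and lift. The records, the labels, and the helper names `support` and `rule_metrics` are hypothetical and chosen only for illustration; they are not the study's data, thresholds, or software.

```python
from fractions import Fraction

# Hypothetical toy records, NOT data from the study: each record is the set
# of property labels attached to one medicine.
records = [
    {"cold", "bitter", "antibacterial"},
    {"cold", "bitter", "antibacterial"},
    {"cold", "sour", "antibacterial"},
    {"warm", "sweet"},
    {"warm", "bitter", "antibacterial"},
    {"cold", "sweet"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    hits = sum(1 for record in records if itemset <= record)
    return Fraction(hits, len(records))


def rule_metrics(antecedent, consequent, records):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Assumes both the antecedent and the consequent occur in at least one
    record; otherwise the divisions below raise ZeroDivisionError.
    """
    s_both = support(antecedent | consequent, records)
    confidence = s_both / support(antecedent, records)
    lift = confidence / support(consequent, records)
    return s_both, confidence, lift


sup, conf, lift = rule_metrics({"cold"}, {"antibacterial"}, records)
print(f"support={sup}, confidence={conf}, lift={lift}")
```

A lift above 1 means the antecedent and consequent co-occur more often than they would if they were independent, which is the sense in which a property can be said to be associated with antibacterial activity.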

Keywords: antibacterial activity; distribution law; machine learning; model construction; traditional Chinese medicine (TCM).