Machine learning strengthened prediction of tracheal, bronchus, and lung cancer deaths due to air pollution

Environ Sci Pollut Res Int. 2023 Sep;30(45):100539-100551. doi: 10.1007/s11356-023-29448-y. Epub 2023 Aug 28.

Abstract

This work examined the use of machine learning tools to predict the effect of CO, O3, CH4, and CO2 on tracheal, bronchus, and lung (TBL) cancer deaths from 1990 to 2019, using data from 203 countries/locations. We used evaluation metrics including accuracy, area under the curve (AUC), recall, precision, and Matthews correlation coefficient (MCC) to determine the prediction efficiency of the models. Models that yielded accuracy between roughly 89% and 91% were selected. The essential features in the prediction process were extracted, and CO was found to influence the predictions. The extra trees classifier, random forest classifier, gradient boosting classifier, and light gradient boosting machine were selected from among 14 other classifiers based on the accuracy metric. The best-performing models, according to our benchmark standards, were the extra trees classifier (90.83%), random forest classifier (89.17%), gradient boosting classifier (89.17%), and light gradient boosting machine (89.17%). We conclude that machine learning models can be used to predict mortality, i.e., the number of deaths, and could assist in predicting the role of air pollutants in TBL cancer deaths globally.
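The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary classification target (the abstract does not state how the death counts were turned into classes), and the function names `confusion_counts`, `binary_metrics`, and `roc_auc` are hypothetical helpers written for this illustration, not the authors' code. It uses only the Python standard library.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and MCC from hard 0/1 predictions."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    # Guard each ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mcc": mcc,
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as half). Quadratic in sample size, so this is
    only suitable for small illustrations."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up labels and scores, purely for illustration (not study data).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(binary_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

MCC uses all four cells of the confusion matrix, so it stays informative when classes are imbalanced, which is one common reason to report it alongside accuracy.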

Keywords: Extra trees classifier; Gradient boosting classifier; Machine learning; Mortality; Random forest classifier; TBL cancer.

MeSH terms

  • Air Pollutants*
  • Air Pollution*
  • Bronchi
  • Humans
  • Lung Neoplasms*
  • Machine Learning

Substances

  • Air Pollutants