Exploring Volatile Organic Compounds in Breath for High-Accuracy Prediction of Lung Cancer

Cancers (Basel). 2021 Mar 21;13(6):1431. doi: 10.3390/cancers13061431.

Abstract

(1) Background: Lung cancer is silent in its early stages and fatal in its advanced stages. Current examinations for lung cancer are usually based on imaging. Conventional chest X-rays lack accuracy, and chest computed tomography (CT) is associated with radiation exposure and cost, which limits screening effectiveness. Breathomics, a noninvasive strategy, has recently been studied extensively. Volatile organic compounds (VOCs) in human breath can reflect metabolic changes caused by disease and may serve as biomarkers of lung cancer. (2) Methods: Selected ion flow tube mass spectrometry (SIFT-MS) was used to quantitatively analyze 116 VOCs in breath samples from 148 patients with histologically confirmed lung cancer and 168 healthy volunteers. We used eXtreme Gradient Boosting (XGBoost), a machine learning method, to build a model for predicting lung cancer occurrence from the quantitative VOC measurements. (3) Results: The proposed prediction model outperformed previously reported approaches, with an accuracy, sensitivity, specificity, and area under the curve (AUC) of 0.89, 0.82, 0.94, and 0.95, respectively. When we further adjusted for the confounding effect of environmental VOCs on the relationship between participants' exhaled VOCs and lung cancer occurrence, the model improved to 0.92 accuracy, 0.96 sensitivity, 0.88 specificity, and 0.98 AUC. (4) Conclusion: A quantitative VOC databank combined with an XGBoost classifier provides a persuasive platform for lung cancer prediction.
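
For readers less familiar with the four metrics reported above, the following Python sketch shows how accuracy, sensitivity, specificity, and AUC can be computed from true labels and predicted probabilities. It is a minimal illustration, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example and are not taken from the study.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity, specificity, and AUC for binary labels
    (1 = lung cancer, 0 = healthy) and predicted probabilities.
    The 0.5 threshold is an assumption for illustration."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "auc": roc_auc(y_true, scores),  # threshold-independent
    }


# Toy data (hypothetical, NOT from the study): 4 patients, 4 controls.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
print(classification_metrics(toy_labels, toy_scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can gain sensitivity while losing some specificity and still show a higher AUC.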

Keywords: SIFT-MS; XGBoost; breath analysis; lung cancer; machine learning; volatile organic compounds.