Applying Data Mining to Investigate Cancer Risk in Patients with Pyogenic Liver Abscess

Healthcare (Basel). 2020 May 22;8(2):141. doi: 10.3390/healthcare8020141.

Abstract

Pyogenic liver abscess (PLA) is usually a complication of biliary tract disease. Taiwan is among the countries with the highest incidence of colorectal cancer (CRC) and hepatocellular carcinoma (HCC). A few studies have investigated whether patients with PLA have higher incidence rates of CRC and HCC, but their findings have been inconclusive. This study assessed the risks of CRC and HCC in patients with PLA and the factors contributing to cancer development in these patients. Clinical tests significantly associated with cancer in patients with PLA were identified to assist in early diagnosis. Odds ratios (ORs) and 95% confidence intervals (CIs) were determined using binary logistic regression. Cancer classification models were constructed using the C5.0 decision tree algorithm; the accuracy of models built with different cancer risk factors was compared to determine the optimal model. Thereafter, rules were summarized from the decision tree model to assist in diagnosis. The results indicated that the risks of CRC and HCC (OR, 3.751; 95% CI, 1.149-12.253) and of CRC (OR, 6.838; 95% CI, 2.679-17.455) were higher in patients with PLA than in those without PLA. The decision tree analysis demonstrated that the model including the PLA variable had the highest accuracy and that classification could be conducted using fewer factors, indicating that PLA is an important factor in HCC and CRC. Two rules for assisting in the diagnosis of CRC and HCC were derived from the decision tree model.
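As a reading aid for the ORs and 95% CIs reported above, the following is a minimal sketch of how an odds ratio and its Wald confidence interval can be computed from a 2x2 table. For a single binary predictor, this unadjusted OR matches what a binary logistic regression with that one predictor would give; the study's own estimates may have been adjusted for other factors. The function name and the counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (NOT data from the study):
# 20 of 100 exposed patients and 10 of 100 unexposed patients have the outcome.
odds_ratio, lower, upper = odds_ratio_with_ci(20, 80, 10, 90)
print(f"OR = {odds_ratio:.3f}, 95% CI = {lower:.3f}-{upper:.3f}")
```

An interval that excludes 1, as both intervals reported in the abstract do, is what marks an association as statistically significant at the 5% level.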

Keywords: cancer risk; colon cancer; data mining; decision tree; hepatocellular carcinoma; pyogenic liver abscess.