A weighted rule based method for predicting malignancy of pulmonary nodules by nodule characteristics

J Biomed Inform. 2015 Aug:56:69-79. doi: 10.1016/j.jbi.2015.05.011. Epub 2015 May 22.

Abstract

Predicting the malignancy of solitary pulmonary nodules from computed tomography (CT) scans is a difficult and important problem in the diagnosis of lung cancer. This paper investigates the contribution of nodule characteristics to the prediction of malignancy. Using data from the Lung Image Database Consortium (LIDC) database, we propose a weighted rule-based classification approach for predicting the malignancy of pulmonary nodules. The LIDC database contains CT scans of nodules together with nodule characteristics evaluated by multiple annotators. In the first step of our method, votes for nodule characteristics are obtained from ensemble classifiers using image features. In the second step, these votes and rules derived from radiologist evaluations are combined by a weighted rule-based method to predict malignancy. The rules are constructed from radiologist evaluations of previous cases. Rule evaluation takes into account the correlations between malignancy and the other nodule characteristics, as well as the agreement ratio among radiologists. To handle the unbalanced nature of the LIDC data, ensemble classifiers and data balancing methods are used. The proposed approach is compared with classification methods trained on image features. Classification accuracy, specificity, and sensitivity of the classifiers are measured. The experimental results show that using nodule characteristics for malignancy prediction can improve classification results.
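The second step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weighting formula, so the sketch assumes that each rule's weight is the product of a correlation strength and a radiologist agreement ratio. The `Rule` class, the `predict_malignancy` function, and all numeric values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a characteristic rating that supports one label."""
    characteristic: str   # e.g. "spiculation"
    rating: int           # rating that triggers the rule
    label: str            # "malignant" or "benign"
    correlation: float    # assumed correlation strength with malignancy, 0..1
    agreement: float      # assumed radiologist agreement ratio, 0..1

    @property
    def weight(self) -> float:
        # Assumption: weight is the product of the two factors.
        return self.correlation * self.agreement


def predict_malignancy(votes, rules):
    """Sum the weights of all rules matched by the classifier votes.

    votes maps a characteristic name to the rating voted by the
    first-step classifiers. Returns (label, scores); label is None
    when no rule matches.
    """
    scores = defaultdict(float)
    for rule in rules:
        if votes.get(rule.characteristic) == rule.rating:
            scores[rule.label] += rule.weight
    if not scores:
        return None, {}
    best = max(scores, key=scores.get)
    return best, dict(scores)


if __name__ == "__main__":
    # Illustrative rules and votes only; not taken from the paper.
    rules = [
        Rule("spiculation", 5, "malignant", 0.8, 0.5),
        Rule("calcification", 3, "benign", 0.5, 0.5),
        Rule("lobulation", 4, "malignant", 0.5, 0.5),
    ]
    votes = {"spiculation": 5, "calcification": 3, "lobulation": 1}
    print(predict_malignancy(votes, rules))
```

In this sketch a rule contributes only when the voted rating matches exactly, and the label with the largest accumulated weight wins. Other matching and tie-breaking schemes are equally plausible given the abstract alone.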

Keywords: Ensemble classifier; Nodule characteristic; Rule based classification; Unbalanced data.

MeSH terms

  • Algorithms
  • Databases, Factual
  • Diagnosis, Computer-Assisted / methods*
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • Lung / diagnostic imaging
  • Lung Neoplasms / diagnosis*
  • Lung Neoplasms / diagnostic imaging*
  • Medical Informatics / methods*
  • Models, Statistical
  • Observer Variation
  • Pattern Recognition, Automated / methods
  • Probability
  • Radiographic Image Interpretation, Computer-Assisted / methods
  • Radiology / methods
  • Radiology Information Systems
  • Reproducibility of Results
  • Semantics
  • Sensitivity and Specificity
  • Solitary Pulmonary Nodule / diagnosis*
  • Solitary Pulmonary Nodule / diagnostic imaging*
  • Support Vector Machine
  • Tomography, X-Ray Computed