Unsupervised class labeling of diffuse lung diseases using frequent attribute patterns

Int J Comput Assist Radiol Surg. 2017 Mar;12(3):519-528. doi: 10.1007/s11548-016-1476-2. Epub 2016 Aug 30.

Abstract

Purpose: Many pattern recognition methods have been applied to the automatic classification of normal and abnormal opacities for computer-aided diagnosis (CAD) of computed tomography (CT) images; however, training an accurate classifier requires a large number of correctly labeled images. Labeling a large number of CT images is very time-consuming and impractical for radiologists. To address this problem, this paper proposes a new clustering algorithm for diffuse lung diseases that uses frequent attribute patterns, providing an unsupervised class labeling mechanism that requires no correct labels.

Methods: A large number of frequently occurring opacity patterns are extracted by a data mining algorithm named genetic network programming (GNP), and the extracted patterns are automatically assigned to several clusters using a genetic algorithm (GA). Lung CT images are used to form clusters of normal and diffuse lung disease opacities.
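To make the notion of a "frequent attribute pattern" concrete, the following is a minimal sketch in Python. It is not the GNP-based miner described in the paper: it substitutes a brute-force enumeration of attribute combinations with a support threshold. The function name `frequent_patterns`, the attribute names, the toy records, and the threshold are all hypothetical, chosen only for illustration.

```python
from itertools import combinations


def frequent_patterns(records, min_support, max_len=2):
    """Return {pattern: support} for attribute sets with support >= min_support.

    Brute-force stand-in for a pattern miner (NOT genetic network programming).
    `records` is a list of sets of attribute names, one set per image region.
    """
    n = len(records)
    attributes = sorted(set().union(*records))
    result = {}
    for size in range(1, max_len + 1):
        for combo in combinations(attributes, size):
            pattern = frozenset(combo)
            # Support = fraction of records that contain every attribute in the pattern.
            count = sum(1 for r in records if pattern <= r)
            support = count / n
            if support >= min_support:
                result[pattern] = support
    return result


# Hypothetical toy data: each record lists the attributes observed in one region.
records = [
    {"high_density", "coarse_texture"},
    {"high_density", "coarse_texture", "reticular"},
    {"low_density"},
    {"high_density", "reticular"},
]

patterns = frequent_patterns(records, min_support=0.5)
for pattern, support in sorted(patterns.items(), key=lambda kv: -kv[1]):
    print(sorted(pattern), support)
```

Exhaustive enumeration grows combinatorially with the number of attributes, which is one reason an evolutionary search such as GNP might be preferred when many attributes are involved.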

Results: Pattern extraction by GNP yielded 1,148 frequent attribute patterns, which GA then grouped into clusters. On a six-class problem (normal plus five kinds of abnormal opacities), the proposed method achieved 47.7% clustering accuracy without using correct class labels in training.
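The abstract does not state how clustering accuracy was computed. One common convention, assumed here purely for illustration, is to map each cluster to its majority ground-truth class and count the samples that agree with that mapping. The function name `clustering_accuracy`, the cluster assignments, and the class labels below are hypothetical and are not data from the study.

```python
from collections import Counter, defaultdict


def clustering_accuracy(cluster_ids, true_labels):
    """Majority-vote clustering accuracy (an assumed metric, not the paper's stated one).

    Each cluster is mapped to the most frequent true label among its members;
    accuracy is the fraction of samples whose true label matches that mapping.
    """
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        members[cluster].append(label)
    # For every cluster, count the members that carry its majority label.
    correct = sum(Counter(labels).most_common(1)[0][1] for labels in members.values())
    return correct / len(true_labels)


# Hypothetical toy example with three clusters and three classes.
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2]
true_labels = [
    "normal", "normal", "abnormal_a",
    "abnormal_a", "abnormal_a",
    "abnormal_b", "abnormal_b", "normal",
]

acc = clustering_accuracy(cluster_ids, true_labels)
print(f"clustering accuracy: {acc:.3f}")
```

Note that the true labels are used only to score the clusters after the fact; they play no role in forming them, which matches the unsupervised setting described above.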

Conclusion: The proposed method can form clusters without using correct labels and has the potential to be applied to CAD, reducing the time cost of labeling CT images.

Keywords: Clustering; Computer-aided diagnosis; Data mining; Diffuse lung diseases; Evolutionary computation; Unsupervised learning.

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Data Mining
  • Diagnosis, Computer-Assisted / methods*
  • Humans
  • Lung / diagnostic imaging*
  • Lung Diseases / diagnostic imaging*
  • Pattern Recognition, Automated / methods*
  • Tomography, X-Ray Computed / methods*
  • Unsupervised Machine Learning*