High throughput analysis of breast cancer specimens on the grid

Med Image Comput Comput Assist Interv. 2007;10(Pt 1):617-25. doi: 10.1007/978-3-540-75757-3_75.

Abstract

Breast cancer accounts for about 30% of all cancers and 15% of all cancer deaths in women in the United States. Advances in computer-assisted diagnosis (CAD) hold promise for early detection and staging of disease progression. In this paper we introduce a Grid-enabled CAD system to perform automatic analysis of imaged histopathology breast tissue specimens. More than 100,000 digitized samples (1200 x 1200 pixels) have already been processed on the Grid. We have analyzed results for 3744 breast tissue samples, which originated from four different institutions and were prepared using diaminobenzidine (DAB) and hematoxylin staining. Both linear and nonlinear dimension reduction techniques were compared, and the best one (ISOMAP) was applied to reduce the dimensionality of the features. The experimental results show that Gentle Boosting using an eight-node CART decision tree as the weak learner provides the best classification result. The algorithm achieves an accuracy of 86.02% using only 20% of the specimens as the training set.
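
The classification step described above can be illustrated with a minimal sketch of Gentle Boosting. This is not the authors' implementation: it assumes synthetic two-dimensional Gaussian features standing in for the ISOMAP-reduced feature vectors, substitutes single-split regression stumps for the eight-node CART weak learner to stay short, and uses hypothetical helper names (`fit_stump`, `gentle_boost`, `run_demo`). Only the 20% training split mirrors the abstract.

```python
import math
import random


def fit_stump(X, y, w):
    """Fit a regression stump by weighted least squares.

    Returns (feature, threshold, left_value, right_value). This is a
    simplified stand-in for the eight-node CART tree named in the abstract.
    """
    best, best_score = None, -1.0
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            wl = wr = sl = sr = 0.0
            for row, yi, wi in zip(X, y, w):
                if row[j] <= t:
                    wl += wi
                    sl += wi * yi
                else:
                    wr += wi
                    sr += wi * yi
            a, b = sl / wl, sr / wr  # weighted mean label on each side
            # Maximizing this is equivalent to minimizing weighted squared error.
            score = wl * a * a + wr * b * b
            if score > best_score:
                best_score, best = score, (j, t, a, b)
    if best is None:
        raise ValueError("all features are constant; no split is possible")
    return best


def stump_predict(stump, row):
    j, t, a, b = stump
    return a if row[j] <= t else b


def gentle_boost(X, y, rounds=20):
    """Gentle Boosting: additive regression fits with exponential reweighting."""
    n = len(X)
    w = [1.0 / n] * n
    stumps = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        stumps.append(stump)
        # Up-weight samples the new weak learner gets wrong, then renormalize.
        w = [wi * math.exp(-yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return stumps


def predict(stumps, row):
    """Classify by the sign of the summed weak-learner outputs."""
    score = sum(stump_predict(s, row) for s in stumps)
    return 1 if score >= 0 else -1


def make_data(n, rng):
    """Synthetic two-class data (an assumption, not the study's features)."""
    data = []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        center = 1.5 * label
        data.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    rng.shuffle(data)
    return data


def run_demo(n=200, train_fraction=0.2, seed=0):
    """Train on 20% of the samples and return accuracy on the remaining 80%."""
    rng = random.Random(seed)
    data = make_data(n, rng)
    n_train = int(n * train_fraction)
    train, test = data[:n_train], data[n_train:]
    stumps = gentle_boost([x for x, _ in train], [lab for _, lab in train])
    correct = sum(predict(stumps, x) == lab for x, lab in test)
    return correct / len(test)


if __name__ == "__main__":
    print(f"held-out accuracy: {run_demo():.3f}")
```

Replacing the stump with a deeper regression tree would bring the sketch closer to the weak learner reported in the abstract; the reweighting loop in `gentle_boost` would stay the same.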

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Artificial Intelligence
  • Breast Neoplasms / diagnostic imaging*
  • Diagnosis, Computer-Assisted / methods*
  • Female
  • Humans
  • Image Enhancement / methods*
  • Image Interpretation, Computer-Assisted / methods*
  • Information Storage and Retrieval / methods
  • Internet*
  • Pattern Recognition, Automated / methods*
  • Radiographic Image Enhancement / methods
  • Radiographic Image Interpretation, Computer-Assisted / methods*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Signal Processing, Computer-Assisted
  • User-Computer Interface