Transfer representation learning for medical image analysis

Annu Int Conf IEEE Eng Med Biol Soc. 2015 Aug:2015:711-4. doi: 10.1109/EMBC.2015.7318461.

Abstract

There are two major challenges to overcome when developing a classifier to perform automatic disease diagnosis. First, the amount of labeled medical data is typically very limited, and a classifier cannot be effectively trained to attain high disease-detection accuracy. Second, medical domain knowledge is required to identify representative features in data for detecting a target disease. Most computer scientists and statisticians do not have such domain knowledge. In this work, we show that employing transfer learning can remedy both problems. We use Otitis Media (OM) to conduct our case study. Instead of using domain knowledge to extract features from labeled OM images, we construct features based on a dataset entirely OM-irrelevant. More specifically, we first learn a codebook in an unsupervised way from 15 million images collected from ImageNet. The codebook gives us what the encoders consider being the fundamental elements of those 15 million images. We then encode OM images using the codebook and obtain a weighting vector for each OM image. Using the resulting weighting vectors as the feature vectors of the OM images, we employ a traditional supervised learning algorithm to train an OM classifier. The achieved detection accuracy is 88.5% (89.63% in sensitivity and 86.9% in specificity), markedly higher than all previous attempts, which relied on domain experts to help extract features.

MeSH terms

  • Algorithms
  • Image Interpretation, Computer-Assisted*