Development and operation of a digital platform for sharing pathology image data

BMC Med Inform Decis Mak. 2021 Apr 3;21(1):114. doi: 10.1186/s12911-021-01466-1.

Abstract

Background: Artificial intelligence (AI) research is highly dependent on the nature of the data available. With the steady increase of AI applications in the medical field, the demand for quality medical data is increasing significantly. We here describe the development of a platform for providing and sharing digital pathology data to AI researchers, and highlight challenges to overcome in operating a sustainable platform in conjunction with pathologists.

Methods: Over 3000 pathological slides from five organs (liver, colon, prostate, pancreas and biliary tract, and kidney) in histologically confirmed tumor cases by pathology departments at three hospitals were selected for the dataset. After digitalizing the slides, tumor areas were annotated and overlaid onto the images by pathologists as the ground truth for AI training. To reduce the pathologists' workload, AI-assisted annotation was established in collaboration with university AI teams.

Results: A web-based data sharing platform was developed to share massive pathological image data in 2019. This platform includes 3100 images, and 5 pre-processing algorithms for AI researchers to easily load images into their learning models.

Discussion: Due to different regulations among countries for privacy protection, when releasing internationally shared learning platforms, it is considered to be most prudent to obtain consent from patients during data acquisition.

Conclusions: Despite limitations encountered during platform development and model training, the present medical image sharing platform can steadily fulfill the high demand of AI developers for quality data. This study is expected to help other researchers intending to generate similar platforms that are more effective and accessible in the future.

Keywords: Artificial intelligence-assisted annotation; Digital pathology; Medical image dataset; Open platform.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Humans
  • Male
  • Neoplasms*