Using Big Data Techniques to Improve Prostate Cancer Reporting in the Gauteng Province, South Africa

Stud Health Technol Inform. 2019 Aug 21:264:1437-1438. doi: 10.3233/SHTI190472.

Abstract

Prostate cancer (PCa) data are of public health importance in South Africa. Biopsy data are recorded as semi-structured narrative text that is not easily analysed. This pilot study applied predictive analytics and text mining techniques to extract prognostic information that guides patient management. In particular, the Gleason score (GS), which was reported in a number of formats, was extracted successfully. Predominantly older men were diagnosed with PCa, with biopsies reporting a high-risk GS (8-10). Where cell differentiation was reported, 64% of biopsies reported poor differentiation. The approaches demonstrated in this study should be applied to a larger dataset to assess whether they have the potential to scale up to the national level.
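As an illustration of what extracting a GS from narrative text can involve, the following minimal sketch uses regular expressions. It is not the method used in the study: the report phrasings it handles (for example "Gleason score 4+5=9" or "Gleason score of 8"), the function names extract_gleason and is_high_risk, and both patterns are assumptions made for this example. Only the high-risk range (GS 8-10) comes from the abstract.

```python
import re

# Assumed format 1: primary + secondary pattern, e.g. "Gleason score 4+5=9".
# The phrasings matched here are hypothetical, not taken from the study data.
_SUM_PATTERN = re.compile(
    r"gleason(?:\s+(?:score|grade|sum))?\s*[:=]?\s*\(?\s*(\d)\s*\+\s*(\d)",
    re.IGNORECASE,
)

# Assumed format 2: total only, e.g. "Gleason score of 8" or "Gleason 7 (3+4)".
_TOTAL_PATTERN = re.compile(
    r"gleason(?:\s+(?:score|grade|sum))?\s*[:=]?\s*(?:of\s+)?(\d{1,2})\b(?!\s*\+)",
    re.IGNORECASE,
)


def extract_gleason(text):
    """Return the Gleason score found in a report snippet, or None."""
    match = _SUM_PATTERN.search(text)
    if match:
        primary, secondary = int(match.group(1)), int(match.group(2))
        # Gleason patterns range from 1 to 5; ignore anything else.
        if 1 <= primary <= 5 and 1 <= secondary <= 5:
            return primary + secondary
    match = _TOTAL_PATTERN.search(text)
    if match:
        total = int(match.group(1))
        # A Gleason score is the sum of two patterns, so it lies in 2..10.
        if 2 <= total <= 10:
            return total
    return None


def is_high_risk(score):
    """High-risk GS as stated in the abstract: 8-10."""
    return score is not None and 8 <= score <= 10


if __name__ == "__main__":
    # Invented example snippets, not real patient data.
    for snippet in ["Gleason score 4+5=9", "GLEASON GRADE: 3 + 4", "Gleason score of 8"]:
        score = extract_gleason(snippet)
        print(snippet, "->", score, "high risk" if is_high_risk(score) else "")
```

Real pathology narratives would likely need more patterns, handling of multiple cores per report, and manual validation against a labelled sample.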

Keywords: Cell differentiation; Gleason score; Prostate cancer; Risk.

MeSH terms

  • Big Data*
  • Humans
  • Male
  • Neoplasm Grading
  • Pilot Projects
  • Prostatic Neoplasms*
  • South Africa