Diabetes detection using deep learning techniques with oversampling and feature augmentation

Comput Methods Programs Biomed. 2021 Apr:202:105968. doi: 10.1016/j.cmpb.2021.105968. Epub 2021 Feb 15.

Abstract

Background and objective: Diabetes is a chronic disease that affects a growing number of people and causes a large number of deaths each year. Furthermore, many people living with the disease do not realize the seriousness of their condition early enough. Because late diagnosis leads to numerous health problems and deaths, developing methods for the early diagnosis of this disease is essential.

Methods: This paper proposes a pipeline based on deep learning techniques to predict whether a person has diabetes. It comprises data augmentation using a variational autoencoder (VAE), feature augmentation using a sparse autoencoder (SAE), and a convolutional neural network (CNN) for classification. The pipeline was evaluated on the Pima Indians Diabetes Database, which records patient information such as the number of pregnancies, glucose level, insulin level, blood pressure, and age.
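
The feature augmentation step can be illustrated with a minimal NumPy sketch of a sparse autoencoder. This is not the authors' implementation: the hidden size, the KL-divergence sparsity penalty with target `rho`, the weight `beta`, and the choice to concatenate original and learned features in `augment_features` are all assumptions made for illustration, and the input is random placeholder data rather than patient records. Only the input width of 8 reflects the Pima dataset's eight predictors.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sae_forward(X, W_enc, b_enc, W_dec, b_dec):
    """Encode X into hidden activations and reconstruct the input."""
    H = sigmoid(X @ W_enc + b_enc)   # hidden (learned) features in (0, 1)
    X_hat = H @ W_dec + b_dec        # linear reconstruction of the input
    return H, X_hat

def sae_loss(X, H, X_hat, rho=0.05, beta=1.0):
    """Mean squared reconstruction error plus a KL sparsity penalty."""
    mse = np.mean((X - X_hat) ** 2)
    # Mean activation of each hidden unit over the batch, clipped for log().
    rho_hat = np.clip(H.mean(axis=0), 1e-8, 1 - 1e-8)
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return mse + beta * kl

def augment_features(X, H):
    """Assumed scheme: append learned features to the original ones."""
    return np.concatenate([X, H], axis=1)

rng = np.random.default_rng(0)
n_samples, n_inputs, n_hidden = 16, 8, 32    # n_hidden is an arbitrary choice
X = rng.normal(size=(n_samples, n_inputs))   # placeholder data, not real patients

# Untrained, randomly initialised weights (training loop omitted).
W_enc = rng.normal(scale=0.1, size=(n_inputs, n_hidden))
b_enc = np.zeros(n_hidden)
W_dec = rng.normal(scale=0.1, size=(n_hidden, n_inputs))
b_dec = np.zeros(n_inputs)

H, X_hat = sae_forward(X, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(X, H, X_hat)
X_aug = augment_features(X, H)
print(X_aug.shape, float(loss))
```

In a jointly trained setup such as the one the abstract describes, a loss of this kind would be combined with the classifier's loss so that the learned features are shaped by both objectives; how the authors weight the two terms is not stated here.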

Results: An accuracy of 92.31% was obtained when the CNN classifier was trained jointly with the SAE for feature augmentation on a well-balanced dataset. This represents an accuracy improvement of 3.17% over the state of the art.
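
The balanced dataset mentioned in the results comes from oversampling the minority class with generated samples. The sketch below shows only the bookkeeping involved, under stated assumptions: the class counts (500 negative, 268 positive) are those of the commonly distributed Pima dataset and may differ from the split used in the paper; `decode` is a hypothetical stand-in for a trained VAE decoder, implemented here as a random linear map; and the latent dimension is arbitrary.

```python
import numpy as np

def n_synthetic_needed(y):
    """Return (number of samples to add, minority label) for binary labels y."""
    counts = np.bincount(y, minlength=2)
    return int(abs(counts[0] - counts[1])), int(np.argmin(counts))

def decode(z, W, b):
    """Hypothetical stand-in for a trained VAE decoder (here: a linear map)."""
    return z @ W + b

rng = np.random.default_rng(0)
n_features, latent_dim = 8, 2                  # latent_dim is an arbitrary choice
y = np.array([0] * 500 + [1] * 268)            # assumed class counts
X = rng.normal(size=(y.size, n_features))      # placeholder data, not real patients

# Untrained decoder weights; a real pipeline would fit the VAE on
# minority-class samples before generating new ones.
W = rng.normal(scale=0.1, size=(latent_dim, n_features))
b = np.zeros(n_features)

n_new, minority = n_synthetic_needed(y)
z = rng.standard_normal((n_new, latent_dim))   # draw latents from the N(0, I) prior
X_new = decode(z, W, b)

X_bal = np.vstack([X, X_new])
y_bal = np.concatenate([y, np.full(n_new, minority)])
print(n_new, np.bincount(y_bal))
```

Sampling from the standard normal prior and decoding is the usual way to generate new points from a VAE; whether the authors sample from the prior or from posteriors of existing minority samples is not specified in the abstract.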

Conclusions: Using a full deep learning pipeline for data preprocessing and classification has proven very promising in the diabetes detection field, outperforming state-of-the-art proposals.

Keywords: Deep learning; Detection; Diabetes; Oversampling; Sparse autoencoder; Variational autoencoder.

MeSH terms

  • Databases, Factual
  • Deep Learning*
  • Diabetes Mellitus* / diagnosis
  • Humans
  • Neural Networks, Computer