Comparing deep neural network and other machine learning algorithms for stroke prediction in a large-scale population-based electronic medical claims database

Annu Int Conf IEEE Eng Med Biol Soc. 2017 Jul:2017:3110-3113. doi: 10.1109/EMBC.2017.8037515.

Abstract

Electronic medical claims (EMCs) can be used to accurately predict the occurrence of a variety of diseases, which can contribute to precise medical interventions. While there is growing interest in applying machine learning (ML) techniques to clinical problems, the use of deep learning in healthcare has only recently gained attention. Deep learning methods, such as the deep neural network (DNN), have achieved impressive results in speech recognition, computer vision, and natural language processing in recent years. However, deep learning is often difficult to comprehend because of the complexity of its framework. Furthermore, it has not yet been shown to outperform conventional ML algorithms in disease prediction tasks using EMCs. In this study, we use a large population-based EMC database of around 800,000 patients to compare DNN with three other ML approaches for predicting 5-year stroke occurrence. The results show that DNN and gradient boosting decision tree (GBDT) achieve similarly high prediction accuracies, both better than those of the logistic regression (LR) and support vector machine (SVM) approaches. In addition, DNN achieves its optimal results using smaller amounts of patient data than the GBDT method.
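The kind of comparison described above, scoring several classifiers on a held-out set while varying the amount of training data, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the data are synthetic, the function names (`make_synthetic`, `fit_majority`, `fit_logistic`, `learning_curve`) are hypothetical, and only a plain logistic regression and a majority-class baseline are included. DNN, GBDT, and SVM models would need a dedicated library and could be plugged in as further entries of the same dictionary.

```python
import math
import random


def make_synthetic(n, seed):
    """Return n (features, label) pairs from a noisy linear rule (synthetic, not EMC data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(3)]
        score = 1.5 * x[0] - 1.0 * x[1] + rng.gauss(0, 0.5)
        rows.append((x, 1 if score > 0 else 0))
    return rows


def fit_majority(train):
    """Baseline: always predict the most common training label."""
    positives = sum(y for _, y in train)
    label = 1 if 2 * positives >= len(train) else 0
    return lambda x: label


def fit_logistic(train, epochs=200, lr=0.1):
    """Logistic regression trained with batch gradient descent."""
    dim = len(train[0][0])
    w, b = [0.0] * dim, 0.0
    n = len(train)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in train:
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad_b += err
            for j in range(dim):
                grad_w[j] += err * x[j]
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return lambda x: 1 if b + sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0


def accuracy(predict, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def learning_curve(fitters, train, test, sizes):
    """For each model, fit on the first n training rows and score on the fixed test set."""
    return {
        name: [accuracy(fit(train[:n]), test) for n in sizes]
        for name, fit in fitters.items()
    }


train = make_synthetic(500, seed=0)
test = make_synthetic(1000, seed=1)
sizes = [20, 100, 500]  # assumed training-set sizes, chosen only for illustration
curves = learning_curve(
    {"majority": fit_majority, "logistic": fit_logistic}, train, test, sizes
)
for name, scores in curves.items():
    print(name, [round(s, 3) for s in scores])
```

Reading each model's scores across `sizes` shows how much training data it needs before its accuracy levels off, which is the sense in which one method can reach its best results with less patient data than another.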

Publication types

  • Comparative Study

MeSH terms

  • Algorithms
  • Databases, Factual
  • Humans
  • Machine Learning*
  • Neural Networks, Computer
  • Stroke
  • Support Vector Machine