Correlation Analysis to Identify the Effective Data in Machine Learning: Prediction of Depressive Disorder and Emotion States

Int J Environ Res Public Health. 2018 Dec 19;15(12):2907. doi: 10.3390/ijerph15122907.

Abstract

Correlation analysis is a widely used technique for identifying relationships in data. These relationships help reveal how relevant each attribute is to the target class to be predicted. This study uses correlation analysis and machine learning-based approaches to identify the attributes in a dataset that have a significant impact on classifying a patient's mental health status. The correlation analysis was performed in Weka on a dataset of depressive disorder symptoms and situations based on weather conditions, as well as on emotion classification based on physiological sensor readings. Pearson's product-moment correlation and several classification algorithms were used for this analysis. The results show interesting correlations in weather attributes for bipolar patients, as well as in features extracted from physiological data for emotional states.
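
As an illustration of the correlation step described above, the following minimal Python sketch computes Pearson's product-moment correlation between each attribute and a numerically encoded target class, then ranks attributes by absolute correlation. It is not the authors' Weka workflow: the function names (pearson_r, rank_attributes), the attribute names, and the toy values are all hypothetical and chosen only to show the idea.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no linear information about the target.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_attributes(columns, target):
    """Return (attribute, r) pairs sorted by |r|, strongest first."""
    scores = {name: pearson_r(values, target) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data: two weather-style attributes and a binary class label.
toy_columns = {
    "temperature": [10, 12, 15, 20, 22, 25],
    "humidity": [60, 58, 63, 59, 61, 60],
}
toy_target = [0, 0, 0, 1, 1, 1]

ranking = rank_attributes(toy_columns, toy_target)
for name, r in ranking:
    print(f"{name}: r = {r:.3f}")
```

Ranking by absolute value treats strong negative and strong positive correlations as equally informative, which is one common convention for correlation-based attribute selection. Attributes near the bottom of such a ranking are candidates for removal before training a classifier.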

Keywords: correlation analysis; data analytics; health care; machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Algorithms*
  • Decision Trees
  • Depressive Disorder / diagnosis*
  • Diagnosis, Computer-Assisted / methods*
  • Diagnosis, Computer-Assisted / statistics & numerical data*
  • Emotions*
  • Female
  • Humans
  • Machine Learning
  • Male
  • Middle Aged