Machine Learning, Sentiment Analysis, and Tweets: An Examination of Alzheimer's Disease Stigma on Twitter

J Gerontol B Psychol Sci Soc Sci. 2017 Sep 1;72(5):742-751. doi: 10.1093/geronb/gbx014.

Abstract

Objectives: Social scientists need practical methods for harnessing large, publicly available datasets that inform the social context of aging. We describe our development of a semi-automated text coding method and demonstrate its use with a content analysis of how Alzheimer's disease (AD) and dementia are portrayed on Twitter. The approach improves the feasibility of examining large, publicly available datasets.

Method: Machine learning techniques modeled stigmatization expressed in 31,150 AD-related tweets collected via Twitter's search API based on 9 AD-related keywords. Two researchers manually coded 311 randomly selected tweets on 6 dimensions. This input from 1% of the dataset was used to train a classifier on the tweet text, and the trained classifier then coded the remaining 99% of the dataset.
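The workflow described in the Method paragraph (hand-code a small random sample, train on it, then code the remainder automatically) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the classifier, so a multinomial naive Bayes model is used here as a stand-in; the six coding dimensions are collapsed into a single hypothetical binary label (1 = stigmatizing use, 0 = other); and the example tweets are invented toy strings, not data from the study. The function names `split_for_coding`, `train_naive_bayes`, and `predict` are likewise hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def split_for_coding(tweets, fraction=0.01, seed=0):
    """Randomly pick a fraction of tweets for manual coding; return (coded, rest)."""
    rng = random.Random(seed)
    n_coded = max(1, int(len(tweets) * fraction))
    coded_idx = set(rng.sample(range(len(tweets)), n_coded))
    coded = [t for i, t in enumerate(tweets) if i in coded_idx]
    rest = [t for i, t in enumerate(tweets) if i not in coded_idx]
    return coded, rest


def train_naive_bayes(texts, labels):
    """Collect the counts needed for multinomial naive Bayes."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the most probable label, using add-one (Laplace) smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Sample sizes: 1% of a 31,150-item corpus gives 311 items to code by hand.
placeholder_corpus = [f"tweet {i}" for i in range(31150)]
coded, rest = split_for_coding(placeholder_corpus, fraction=0.01)

# Invented toy training set standing in for the hand-coded sample.
hand_coded = [
    ("lol forgot my keys again i must have alzheimers", 1),
    ("he is so senile total dementia joke", 1),
    ("my grandmother was diagnosed with alzheimers today", 0),
    ("donate to support dementia research and caregivers", 0),
]
model = train_naive_bayes([t for t, _ in hand_coded], [y for _, y in hand_coded])

# Invented toy "remaining" tweets, coded automatically by the model.
uncoded = [
    "forgot my phone lol i have alzheimers",
    "support caregivers and dementia research",
]
auto_codes = [predict(model, t) for t in uncoded]
stigma_share = sum(auto_codes) / len(auto_codes)
```

In practice the automated codes would also be checked against a held-out portion of the hand-coded sample before any corpus-level percentage is reported; that validation step is omitted here for brevity.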

Results: Our automated process found that 21.13% of the AD-related tweets used AD-related keywords in ways that perpetuate public stigma, which could reinforce stereotypes and negative expectations for individuals with the disease and increase "excess disability."

Discussion: This technique could be applied to questions in social gerontology related to how social media outlets reflect and shape attitudes bearing on other developmental outcomes. Recommendations for the collection and analysis of large Twitter datasets are discussed.

Keywords: Attitudes; Data mining; Social media; Stigma.

MeSH terms

  • Aged
  • Aged, 80 and over
  • Ageism / psychology*
  • Algorithms*
  • Alzheimer Disease / psychology*
  • Data Mining
  • Disability Evaluation
  • Female
  • Humans
  • Machine Learning*
  • Male
  • Public Opinion*
  • Social Media*
  • Social Stigma*
  • United States