Perplexity analysis of obesity news coverage

AMIA Annu Symp Proc. 2009 Nov 14;2009:426-30.

Abstract

An important task in the analysis of health news coverage is identifying news articles related to a specific health topic (e.g., obesity). This is often done using a combination of keyword searching and manual encoding of news content. Statistical language models and their evaluation metric, perplexity, may help to automate this task. A perplexity study of obesity news was performed to evaluate perplexity as a measure of the similarity of news corpora to obesity news content. The results showed that perplexity increased as news coverage became more general relative to obesity news (obesity news approximately 187, general health news approximately 278, general news approximately 378, general news across multiple publishers approximately 382). This indicates that language model perplexity can measure the similarity of news content to obesity news coverage, and could be used as the basis for an automated health news classifier.
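To illustrate the idea of perplexity as a similarity measure, the following is a minimal sketch only. The abstract does not state the n-gram order, smoothing method, or toolkit used in the study, so the unigram model with add-one smoothing, the toy sentences, and the function names `train_unigram` and `perplexity` are all assumptions made for illustration. A model is trained on on-topic text, and text that resembles the training data should receive lower perplexity than unrelated text.

```python
import math
from collections import Counter

def train_unigram(tokens):
    """Count token frequencies; returns (counts, total token count)."""
    counts = Counter(tokens)
    return counts, sum(counts.values())

def perplexity(model, tokens, vocab_size):
    """Perplexity of `tokens` under a unigram model with add-one smoothing.

    Assumption: add-one smoothing is used here only to keep the sketch
    short; it is not taken from the study.
    """
    counts, total = model
    log_prob = 0.0
    for tok in tokens:
        # Counter returns 0 for unseen tokens, so smoothing keeps p > 0.
        p = (counts[tok] + 1) / (total + vocab_size)
        log_prob += math.log2(p)
    return 2 ** (-log_prob / len(tokens))

# Hypothetical toy data, not drawn from the study's corpora.
train = "obesity rates rise as diet and exercise decline".split()
on_topic = "obesity and diet".split()
off_topic = "stock markets rise".split()

# The vocabulary covers every token seen in any of the toy texts.
vocab_size = len(set(train) | set(on_topic) | set(off_topic))

model = train_unigram(train)
ppl_on = perplexity(model, on_topic, vocab_size)
ppl_off = perplexity(model, off_topic, vocab_size)

print(f"on-topic perplexity:  {ppl_on:.2f}")
print(f"off-topic perplexity: {ppl_off:.2f}")
```

In this toy setting the on-topic text scores lower than the off-topic text, which mirrors the ordering reported above. A classifier built on this idea would additionally need a decision threshold, which the abstract does not specify.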

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bibliometrics
  • Data Collection
  • Humans
  • Journalism, Medical*
  • Language
  • Mass Media / statistics & numerical data*
  • Obesity*
  • United States