Relationship Between State-Level Google Online Search Volume and Cancer Incidence in the United States: Retrospective Study

J Med Internet Res. 2018 Jan 8;20(1):e6. doi: 10.2196/jmir.8870.

Abstract

Background: In the United States, cancer is common, with high morbidity and mortality; cancer incidence varies between states. Online searches reflect public awareness, which could be driven by the underlying regional cancer epidemiology.

Objective: The objective of our study was to characterize the relationship between cancer incidence and online Google search volumes in the United States for 6 common cancers. A secondary objective was to evaluate the association of search activity with cancer-related public events and celebrity news coverage.

Methods: We performed a population-based, retrospective study comparing state-level cancer incidence from 2004 through 2013, as reported by the Centers for Disease Control and Prevention for breast, prostate, colon, lung, and uterine cancers and leukemia, with Google Trends (GT) relative search volume (RSV), a metric designed by Google to allow interest in search topics to be compared between regions. Participants included persons in the United States who searched for cancer terms on Google. The primary measures were the correlation between annual state-level cancer incidence and RSV, as determined by Spearman correlation, and linear regression with RSV and year as independent variables and cancer incidence as the dependent variable. Temporal associations between search activity and events raising public awareness, such as cancer awareness months and cancer-related celebrity news, were described.
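The two primary measures named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not say which software was used, and the toy records, the helper names (`ranks`, `pearson`, `spearman`, `ols`), and the choice to express year as years since 2004 are all assumptions made for illustration. The numbers are synthetic and do not come from the study.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def ols(rows, y):
    """Least-squares fit of y ~ intercept + predictors via the normal equations."""
    X = [[1.0, *row] for row in rows]
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(p):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][p] for i in range(p)]

# Synthetic example (NOT study data): one hypothetical state, six years.
years_since_2004 = [0, 1, 2, 3, 4, 5]
rsv = [40, 55, 48, 70, 62, 80]
# Incidence constructed as 100 + 0.5 * RSV - 2 * year so the fit is checkable.
incidence = [100 + 0.5 * s - 2 * t for s, t in zip(rsv, years_since_2004)]

rho = spearman(rsv, incidence)
intercept, b_rsv, b_year = ols(list(zip(rsv, years_since_2004)), incidence)
print(f"Spearman rho = {rho:.3f}")
print(f"incidence ~ {intercept:.2f} + {b_rsv:.2f} * RSV + {b_year:.2f} * year")
```

Because the toy incidence values are built from a known linear formula, the fitted RSV and year coefficients should recover that formula. A real analysis would pool records across states and years and would also report P values and confidence intervals, which this sketch omits.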

Results: At the state level, RSV was significantly correlated with incidence for breast (r=.18, P=.001), prostate (r=-.27, P<.001), lung (r=.33, P<.001), and uterine cancers (r=.39, P<.001) and leukemia (r=.13, P=.003) but not colon cancer (r=-.02, P=.66). After adjusting for time, state-level RSV was positively correlated with cancer incidence for all cancers: breast (P<.001, 95% CI 0.06 to 0.19), prostate (P=.38, 95% CI -0.08 to 0.22), lung (P<.001, 95% CI 0.33 to 0.46), colon (P<.001, 95% CI 0.11 to 0.17), and uterine cancers (P<.001, 95% CI 0.07 to 0.12) and leukemia (P<.001, 95% CI 0.01 to 0.03). Temporal associations in GT were noted with breast cancer awareness month but not with other cancer awareness months and celebrity events.

Conclusions: Cancer incidence is correlated with online search volume at the state level. Search patterns were temporally associated with cancer awareness months and celebrity announcements. Online searches reflect public awareness. Advancing understanding of online search patterns could augment traditional epidemiologic surveillance, provide opportunities for targeted patient engagement, and allow public information campaigns to be evaluated in ways that were not previously possible.

Keywords: Google; Internet; cancer; incidence; infodemiology.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Awareness
  • Humans
  • Incidence
  • Internet / standards*
  • Neoplasms / epidemiology*
  • Retrospective Studies
  • Search Engine / methods*
  • United States