Investigating the congruence of crowdsourced information with official government data: the case of pediatric clinics

J Med Internet Res. 2014 Feb 3;16(2):e29. doi: 10.2196/jmir.3078.

Abstract

Background: Health 2.0 benefits society by harnessing collective intelligence to help patients acquire knowledge about health care. However, misleading information can directly affect patients' choices of hospitals and drugs, and potentially exacerbate their health conditions.

Objective: This study investigates the congruence between crowdsourced information and official government data in the health care domain and identifies the determinants of low congruence where it exists. In line with infodemiology, we suggest measures to help patients in regions vulnerable to inaccurate health information.

Methods: We text-mined multiple online health communities in South Korea to construct the data for crowdsourced information on public health services (173,748 messages). Kendall tau and Spearman rank order correlation coefficients were used to quantify the differences between 2 ranking systems of health care quality: actual government evaluations of 779 hospitals and mining results of geospecific online health communities. We then estimated the effect of sociodemographic characteristics on the level of congruence by using an ordinary least squares regression.
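
To illustrate how two ranking systems can be compared, the following is a minimal sketch of the Spearman and Kendall tau coefficients in plain Python. It is not the authors' code: the function names and the five-clinic toy rankings are assumptions made up for illustration, and the formulas used are the simple versions that assume no tied ranks.

```python
from itertools import combinations


def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical rankings of five clinics in one region (invented data).
official = [1, 2, 3, 4, 5]   # rank from government evaluation
crowd = [2, 1, 3, 5, 4]      # rank from mined community messages

rho = spearman_rho(official, crowd)
tau = kendall_tau(official, crowd)
print(f"Spearman rho = {rho:.2f}, Kendall tau = {tau:.2f}")
```

Values near 1 indicate that the crowdsourced ranking closely follows the official one; values near 0 indicate little agreement. Real data with tied ranks would need the tie-corrected variants of both coefficients.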

Results: The regression results indicated that the standard deviation of married women's education (P=.046), population density (P=.01), number of doctors per pediatric clinic (P=.048), and birthrate (P=.002) have a significant effect on the congruence of crowdsourced data (adjusted R²=.33). Specifically, (1) the higher the birthrate in a given region, (2) the larger the variance in educational attainment, (3) the higher the population density, and (4) the greater the number of doctors per clinic, the more likely that crowdsourced information from online communities is congruent with official government data.
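
For readers unfamiliar with the adjusted R² statistic reported above, here is a small sketch of an ordinary least squares fit and the adjustment for the number of predictors. It is a simplification under stated assumptions: it uses a single hypothetical predictor rather than the study's several regressors, and the data points and function names are invented for illustration.

```python
def ols_simple(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


def adjusted_r2(r2, n, p):
    """Penalize R² for using p predictors on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Invented example: a regional predictor vs. a congruence score.
predictor = [1, 2, 3, 4, 5]
congruence = [2, 4, 5, 4, 5]

slope, intercept, r2 = ols_simple(predictor, congruence)
adj = adjusted_r2(r2, n=len(predictor), p=1)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, "
      f"R2={r2:.2f}, adjusted R2={adj:.2f}")
```

The adjustment lowers R² more strongly when many predictors are fitted to few observations, which is why it is the usual summary for a multi-predictor regression.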

Conclusions: To investigate the cause of the spread of misleading health information in the online world, we adopted a unique approach by associating mining results on hospitals from geospecific online health communities with the sociodemographic characteristics of corresponding regions. We found that the congruence of crowdsourced information on health care services varied across regions and that these variations could be explained by geospecific demographic factors. This finding can be helpful to governments in reducing the potential risk of misleading online information and the accompanying safety issues.

Keywords: crowdsourcing; online health community; public health; risk of misinformation.

MeSH terms

  • Anti-Bacterial Agents / therapeutic use
  • Child
  • Child Health Services / standards*
  • Crowdsourcing*
  • Data Mining
  • Delivery of Health Care
  • Federal Government
  • Hospitals, Pediatric / standards*
  • Hospitals, Urban / standards
  • Humans
  • Least-Squares Analysis
  • Online Systems
  • Pediatrics / standards*
  • Republic of Korea
  • Socioeconomic Factors
  • Unnecessary Procedures / statistics & numerical data

Substances

  • Anti-Bacterial Agents