Evaluating meteorological data from weather stations, and from satellites and global models for a multi-site epidemiological study

Environ Res. 2018 Aug:165:91-109. doi: 10.1016/j.envres.2018.02.027. Epub 2018 Apr 21.

Abstract

Background: Longitudinal and time series analyses are needed to characterize the associations between hydrometeorological parameters and health outcomes. Earth Observation (EO) climate data products derived from satellites and global model-based reanalysis have the potential to be used as surrogates in situations and locations where weather-station-based observations are inadequate or incomplete. However, these products often lack direct evaluation at specific sites of epidemiological interest.

Methods: Standard evaluation metrics of correlation, agreement, bias and error were applied to a set of ten hydrometeorological variables extracted from two quasi-global, commonly used climate data products - the Global Land Data Assimilation System (GLDAS) and Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS) - to evaluate their performance relative to weather-station-derived estimates at the specific geographic locations of the eight sites in a multi-site cohort study. These metrics were calculated for daily estimates, for 7-day averages, and for a rotavirus-peak-season subset. The variables from each of the two sources were then used as predictors in longitudinal regression models to test their association with rotavirus infection in the cohort after adjusting for covariates.
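
As an illustration of the kind of evaluation described in the Methods, the following is a minimal sketch in Python, not the study's actual code. The abstract does not name the specific metrics, so the choices here are assumptions: Pearson's r for correlation, Willmott's index of agreement for agreement, mean difference (product minus station) for bias, and root-mean-square error for error. The function names `evaluation_metrics` and `block_means` are hypothetical, and the 7-day averaging is assumed to use non-overlapping windows.

```python
import math


def evaluation_metrics(station, product):
    """Compare paired station observations with gridded-product estimates.

    Missing values are passed as None; a pair is used only if both are present.
    The metric choices are illustrative assumptions, not the paper's definitions.
    """
    pairs = [(s, p) for s, p in zip(station, product)
             if s is not None and p is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")

    s_mean = sum(s for s, _ in pairs) / n
    p_mean = sum(p for _, p in pairs) / n

    # Pearson correlation coefficient.
    cov = sum((s - s_mean) * (p - p_mean) for s, p in pairs)
    s_ss = sum((s - s_mean) ** 2 for s, _ in pairs)
    p_ss = sum((p - p_mean) ** 2 for _, p in pairs)
    r = cov / math.sqrt(s_ss * p_ss) if s_ss > 0 and p_ss > 0 else math.nan

    # Mean bias (product minus station) and root-mean-square error.
    sq_err = sum((p - s) ** 2 for s, p in pairs)
    bias = p_mean - s_mean
    rmse = math.sqrt(sq_err / n)

    # Willmott's index of agreement (1 = perfect agreement).
    denom = sum((abs(p - s_mean) + abs(s - s_mean)) ** 2 for s, p in pairs)
    agreement = 1 - sq_err / denom if denom > 0 else math.nan

    return {"n": n, "r": r, "bias": bias, "rmse": rmse, "agreement": agreement}


def block_means(values, window=7):
    """Average a daily series over non-overlapping windows (default 7 days).

    Missing days (None) are skipped; a window with no data yields None.
    A trailing partial window is kept and averaged over the days it has.
    """
    means = []
    for start in range(0, len(values), window):
        present = [v for v in values[start:start + window] if v is not None]
        means.append(sum(present) / len(present) if present else None)
    return means
```

Under these assumptions, the daily comparison would be `evaluation_metrics(station_daily, product_daily)` and the weekly comparison `evaluation_metrics(block_means(station_daily), block_means(product_daily))`. A peak-season evaluation would apply the same functions to the subset of days falling in that season.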

Results: The availability and completeness of station-based validation data varied depending on the variable and study site. The performance of the two gridded climate products varied considerably: between variables at the same location, between locations for the same variable, between evaluation criteria, and between the peak-season subset and the full dataset, with no obvious pattern. The station-based and EO-derived estimates also differed in the statistical significance of their association with the rotavirus outcome. For some variables, the station-based records showed a strong association while the EO-derived estimates showed none; for others, the opposite was true.

Conclusion: Researchers wishing to utilize publicly available climate data - whether EO-derived or station-based - are advised to recognize the specific limitations of these data in both the analysis and the interpretation of the results. Epidemiologists engaged in prospective research into environmentally driven diseases should install their own weather monitoring stations at their study sites whenever possible, in order to circumvent the constraints of choosing between distant or incomplete station data and unverified EO estimates.

Trial registration: ClinicalTrials.gov NCT02441426.

Keywords: Climate; Earth Observation data; Environmental epidemiology; Meteorological data; Rotavirus.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bangladesh
  • Cohort Studies
  • Data Analysis
  • Epidemiologic Studies*
  • Meteorology* / instrumentation
  • Meteorology* / standards
  • Models, Statistical*
  • Spacecraft*
  • Weather*

Associated data

  • ClinicalTrials.gov/NCT02441426