Data-driven techniques for temperature data prediction: big data analytics approach

Environ Monit Assess. 2023 Jan 30;195(2):343. doi: 10.1007/s10661-023-10961-z.

Abstract

Studying past and current weather events is a prerequisite for extrapolation, climate change studies, and other meteorological analyses. NASA (National Aeronautics and Space Administration) has developed a model capable of predicting various weather variables for any location on Earth, including locations that lack weather stations, weather satellite coverage, and other weather measuring instruments. This paper evaluates the prediction accuracy of the NASA temperature data against NiMet (Nigerian Meteorological Agency) ground-truth measurements, using Akwa Ibom Airport as a case study. Exploratory data analysis (descriptive and diagnostic analyses) of temperature data retrieved from NiMet and NASA was performed to guide the predictive and prescriptive analyses. The accuracy of the NASA predictions at the corresponding resolution was calculated using 2783 days of NiMet weather data as ground truth. For maximum temperature, the mean absolute error (MAE) was 2.184 °C and the root mean square error (RMSE) was 2.579 °C, with a coefficient of determination (R²) of 0.710; for minimum temperature, the MAE was 0.876 °C and the RMSE was 1.225 °C, with an R² of 0.620. There is a good correlation between the two datasets; hence, a model can be developed to generate more accurate predictions using the NASA data as input. Predictive and prescriptive analyses were performed with five prediction algorithms: decision tree regression, XGBoost regression, and a multilayer perceptron (MLP) trained with each of three optimizers: LBFGS (limited-memory Broyden-Fletcher-Goldfarb-Shanno), SGD (stochastic gradient descent), and Adam. The MLP with the LBFGS optimizer performed best, reducing the MAE by 35.35% and the RMSE by 31.06% for maximum temperature, and the MAE by 10.05% and the RMSE by 8.00% for minimum temperature. The results show that, given sufficient data, feeding NASA predictions as input to an LBFGS-MLP model gives more accurate temperature predictions for the study area.
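
The error measures quoted in the abstract can be illustrated with a minimal Python sketch. The function names, the `percent_reduction` helper, and the five sample temperatures are hypothetical and were chosen only for illustration; they are not the study's data or code. The abstract does not state which definition of R² was used, so the sketch assumes the common form 1 - SS_res/SS_tot.

```python
import math


def mae(truth, pred):
    """Mean absolute error between ground truth and predictions."""
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)


def rmse(truth, pred):
    """Root mean square error between ground truth and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))


def r_squared(truth, pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_truth = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, pred))
    ss_tot = sum((t - mean_truth) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


def percent_reduction(baseline_error, improved_error):
    """Relative error reduction in percent, as in 'MAE reduced by x%'."""
    return 100.0 * (baseline_error - improved_error) / baseline_error


# Made-up daily maximum temperatures in degrees Celsius (illustration only).
station = [31.0, 32.5, 30.0, 29.5, 33.0]    # stands in for ground truth
modelled = [29.0, 30.0, 28.5, 28.0, 30.5]   # stands in for the raw model estimate
corrected = [30.5, 31.5, 30.0, 29.0, 32.0]  # stands in for a corrected estimate

baseline_mae = mae(station, modelled)
corrected_mae = mae(station, corrected)
print(f"MAE before correction: {baseline_mae:.3f} °C")
print(f"MAE after correction:  {corrected_mae:.3f} °C")
print(f"MAE reduction: {percent_reduction(baseline_mae, corrected_mae):.2f}%")
print(f"RMSE before correction: {rmse(station, modelled):.3f} °C")
print(f"R² after correction: {r_squared(station, corrected):.3f}")
```

Under this reading, the reported percentage improvements compare the error of the corrected predictions with the error of the raw NASA predictions, both measured against the same ground truth.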

Keywords: Artificial neural network; Correlation analysis; Descriptive and diagnostic analyses; Predictive and prescriptive analytics; Weather prediction.

MeSH terms

  • Algorithms
  • Data Science*
  • Environmental Monitoring*
  • Temperature
  • Weather