Explainable machine learning for the prediction and assessment of complex drought impacts

Sci Total Environ. 2023 Nov 10:898:165509. doi: 10.1016/j.scitotenv.2023.165509. Epub 2023 Jul 17.

Abstract

Drought is a common and costly natural disaster with broad social, economic, and environmental impacts. Machine learning (ML) has been widely applied in scientific research because of its outstanding performance on predictive tasks. However, for practical applications like disaster monitoring and assessment, the cost of a model's failure, especially false negative predictions, might significantly affect society. Stakeholders are not satisfied with, or do not "trust", the predictions from a so-called black box. The explainability of ML models is therefore becoming increasingly crucial in studying drought and its impacts. In this work, we propose an explainable ML pipeline using the XGBoost model and the SHAP model, based on a comprehensive database of drought impacts in the U.S. The XGBoost models significantly outperformed the baseline models in predicting the occurrence of multi-dimensional drought impacts derived from the text-based Drought Impact Reporter, attaining an average F2 score of 0.883 at the national level and 0.942 at the state level. The interpretation of the models at the state scale indicates that the Standardized Precipitation Index (SPI) and Standardized Temperature Index (STI) contribute significantly to predicting multi-dimensional drought impacts. The time scale, importance, and relationships of the SPI and STI vary depending on the types of drought impacts and locations. The patterns between the SPI variables and drought impacts indicated by the SHAP values reveal an expected relationship in which negative SPI values positively contribute to complex drought impacts. The explainability based on the SPI variables improves the trustworthiness of the XGBoost models. Overall, this study reveals promising results in accurately predicting complex drought impacts and rendering the relationships between the impacts and indicators more interpretable.
This study also reveals the potential of utilizing explainable ML for the general social good to help stakeholders better understand the multi-dimensional drought impacts at the regional level and motivate appropriate responses.
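
The abstract reports F2 scores and singles out false negatives as the costly failure mode. The sketch below illustrates the general F-beta definition, in which beta = 2 weights recall more heavily than precision, so a missed impact lowers the score more than a false alarm does. The function name `f_beta` and the confusion counts are hypothetical, chosen only for illustration; they are not taken from the study.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score from confusion counts.

    F_beta = (1 + beta^2) * TP / ((1 + beta^2) * TP + beta^2 * FN + FP)
    With beta = 2, each false negative counts four times as much as a
    false positive in the denominator.
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    if denom == 0:
        return 0.0  # no positives predicted or present; score defined as 0 here
    return (1 + b2) * tp / denom


# Hypothetical counts (not from the paper): same number of errors,
# distributed either as mostly false alarms or mostly missed impacts.
mostly_false_alarms = dict(tp=80, fp=30, fn=10)
mostly_missed = dict(tp=80, fp=10, fn=30)

f1_alarms = f_beta(**mostly_false_alarms, beta=1.0)
f1_missed = f_beta(**mostly_missed, beta=1.0)
f2_alarms = f_beta(**mostly_false_alarms, beta=2.0)
f2_missed = f_beta(**mostly_missed, beta=2.0)

print(f"F1: false alarms={f1_alarms:.3f}, missed impacts={f1_missed:.3f}")
print(f"F2: false alarms={f2_alarms:.3f}, missed impacts={f2_missed:.3f}")
```

With these counts, F1 cannot distinguish the two error profiles, while F2 ranks the model that misses more impacts lower, which matches the concern about false negatives raised above.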

Keywords: Drought; Explainable AI; Impact assessment; Machine learning.