Automated Analysis of the US Drought Monitor Maps With Machine Learning and Multiple Drought Indicators

Pouyan Hatami Bahman Beiglou; Lifeng Luo; Pang-Ning Tan; Lisi Pei

doi:10.3389/fdata.2021.750536

Front Big Data. 2021 Oct 25;4:750536. doi: 10.3389/fdata.2021.750536. eCollection 2021.

Authors

Pouyan Hatami Bahman Beiglou¹, Lifeng Luo¹, Pang-Ning Tan², Lisi Pei¹

Affiliations

¹ Department of Geography, Environment, and Spatial Sciences, College of Social Science, Michigan State University, East Lansing, MI, United States.
² Department of Computer Science and Engineering, College of Engineering, Michigan State University, East Lansing, MI, United States.

Abstract

The US Drought Monitor (USDM) is a hallmark in real-time drought monitoring and assessment: it was developed by multiple agencies to provide an accurate and timely assessment of drought conditions in the US on a weekly basis. The map draws on multiple physical indicators and on reported observations from local contributors, which human analysts combine using their best judgement to produce the drought map. Because human subjectivity enters the production of the USDM maps, there is no fully explicit quantitative procedure that other entities can follow to reproduce them. In this study, we developed a framework to automatically generate the maps through a machine learning approach by predicting the drought categories across the study domain. A persistence model served as the baseline for comparison. Three machine learning algorithms (logistic regression, random forests, and support vector machines) were combined with four different groups of input data, giving a total of 12 configurations for predicting drought categories. All configurations were then evaluated against the baseline model to select the best-performing option. The results showed that our proposed framework could reproduce the drought maps to a near-perfect level with the support vector machines algorithm and the group 4 data. The other findings of this study are: 1) employing the past week's drought data as a predictor played an important role in achieving high prediction scores, 2) the nonlinear models (random forests and support vector machines) had better overall performance than the logistic regression models, and 3) by borrowing information from neighboring grid cells, we could compensate for the lack of training data in grid cells with insufficient historical USDM data, particularly for extreme and exceptional drought conditions.

Keywords: SVM–support vector machines; USDM; drought indices; drought monitoring; logistic regression; machine learning; random forest.
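
The experimental design summarized in the abstract (a persistence baseline plus three algorithms crossed with four input groups) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names, the string labels for algorithms and input groups, the use of plain accuracy as the score, and the toy weekly series for a single grid cell. The abstract does not state what each input group contains, so the groups are left as opaque labels.

```python
from itertools import product

# Labels only; the contents of each input group are not given in the abstract.
ALGORITHMS = ["logistic_regression", "random_forest", "support_vector_machine"]
INPUT_GROUPS = ["group_1", "group_2", "group_3", "group_4"]


def configurations():
    """Return every (algorithm, input group) pair: 3 x 4 = 12 configurations."""
    return list(product(ALGORITHMS, INPUT_GROUPS))


def persistence_forecast(weekly_categories):
    """Persistence baseline: predict week t's category as week t-1's category."""
    return weekly_categories[:-1]


def accuracy(predicted, observed):
    """Fraction of weeks where the predicted category equals the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("predicted and observed must be non-empty and equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def persistence_accuracy(weekly_categories):
    """Score the persistence baseline on one grid cell's weekly series."""
    predicted = persistence_forecast(weekly_categories)
    observed = weekly_categories[1:]  # week 2 onward has a previous week to copy
    return accuracy(predicted, observed)


# Hypothetical weekly USDM categories for one grid cell (made-up values).
toy_series = ["None", "D0", "D0", "D1", "D1", "D1", "D2", "D1"]

if __name__ == "__main__":
    print(len(configurations()), "configurations")
    print("persistence accuracy:", persistence_accuracy(toy_series))
```

In a setup like this, each of the 12 configurations would be scored with the same metric on the same weeks, and a configuration is only worth keeping if it beats the persistence score.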