Comparison between optimized MaxEnt and random forest modeling in predicting potential distribution: A case study with Quasipaa boulengeri in China

Sci Total Environ. 2022 Oct 10:842:156867. doi: 10.1016/j.scitotenv.2022.156867. Epub 2022 Jun 22.

Abstract

Random forest (RF) and MaxEnt models are shallow machine learning approaches that perform well in predicting species' potential distributions. RF models usually produce robust results with their default automatic configuration, whereas MaxEnt needs its model settings optimized to perform well, and the difference in predictive performance between optimized MaxEnt and RF is uncertain. To explore this issue, the potential distribution of the endangered amphibian Quasipaa boulengeri in China was predicted using optimized MaxEnt and RF models. A total of 408 occurrence records were selected, 1000 locations were generated as pseudo-absence data by the geographic distance method, and 10,000 sites were selected as background data using a bias file. Partial ROC at different thresholds and success rate curves were used to compare the predictive performance of optimized MaxEnt and RF. Our results showed that both the RF and optimized MaxEnt models performed well in predicting the potential distribution of Q. boulengeri, with the RF model performing slightly better on both partial ROC and success rate curves. Furthermore, the core suitable habitat regions of Q. boulengeri identified by RF and MaxEnt were similar, and all were located in Sichuan, Chongqing, Hubei, Hunan, and Guizhou. However, the RF model produced a habitat suitability map with higher discrimination and greater heterogeneity. Temperature annual range, mean temperature of the driest quarter, and annual precipitation were the key environmental variables limiting the distribution of Q. boulengeri. In this comparison, the RF model was the slightly stronger machine learner. We believe it may be more applicable for predicting the native potential distributions of species with sufficient occurrence data, given its additional predictive detail, simplicity of use, computational time, and operational complexity.
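The abstract names a geographic distance method for generating pseudo-absences but does not give its parameters. As a minimal sketch of the general idea only, the Python below draws random points inside a bounding box and keeps those at least a minimum great-circle distance from every occurrence record. The function names, the 50 km buffer, the bounding box, and the example coordinates are all hypothetical and are not taken from the study.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def sample_pseudo_absences(presences, bbox, n, min_km, seed=0, max_tries=1_000_000):
    """Return n random (lat, lon) points inside bbox, each >= min_km from all presences.

    bbox is (lat_min, lat_max, lon_min, lon_max). Points are drawn uniformly in
    latitude/longitude, which is not an equal-area sample; that is acceptable for
    a sketch but would need correcting for a real analysis.
    """
    rng = random.Random(seed)
    lat_min, lat_max, lon_min, lon_max = bbox
    points = []
    for _ in range(max_tries):
        if len(points) == n:
            break
        lat = rng.uniform(lat_min, lat_max)
        lon = rng.uniform(lon_min, lon_max)
        # Reject candidates that fall inside the exclusion buffer of any presence.
        if all(haversine_km(lat, lon, p_lat, p_lon) >= min_km for p_lat, p_lon in presences):
            points.append((lat, lon))
    if len(points) < n:
        raise RuntimeError("could not place enough pseudo-absences; relax min_km or bbox")
    return points


# Hypothetical presence coordinates (lat, lon); not records from the study.
presences = [(29.6, 106.5), (28.2, 109.7), (30.7, 111.3)]
bbox = (20.0, 35.0, 100.0, 115.0)  # assumed extent, for illustration only
pseudo_absences = sample_pseudo_absences(presences, bbox, n=50, min_km=50.0, seed=42)
```

In a real workflow the candidate points would normally be restricted to land cells of the environmental rasters, and the buffer distance would be chosen from the species' ecology rather than fixed arbitrarily as here.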

Keywords: Optimized MaxEnt model; Overfitting; Quasipaa boulengeri; Random forest; Species distribution models.

MeSH terms

  • Animals
  • Anura*
  • China
  • Ecosystem*
  • Temperature