Convolutional neural networks improve species distribution modelling by capturing the spatial structure of the environment

PLoS Comput Biol. 2021 Apr 19;17(4):e1008856. doi: 10.1371/journal.pcbi.1008856. eCollection 2021 Apr.

Abstract

Convolutional Neural Networks (CNNs) are statistical models suited for learning complex visual patterns. In the context of Species Distribution Models (SDMs), and in line with predictions of landscape ecology and island biogeography, CNNs could capture how local landscape structure affects the prediction of species occurrence. The predictions can thus reflect the signatures of entangled ecological processes. Although previous machine-learning-based SDMs can learn complex influences of environmental predictors, they cannot account for the influence of environmental structure in local landscapes (hence they are denoted "punctual models"). In this study, we applied CNNs to a large dataset of plant occurrences in France (GBIF), on a large taxonomic scale, to predict, by joint learning, the ranked relative probabilities of species at any geographical position. We examined how local environmental landscapes improve prediction by fitting alternative CNN models deprived of information on landscape heterogeneity and structure ("ablation experiments"). We found that the landscape structure around a location contributed crucially to improving the predictive performance of CNN-SDMs. CNN models can classify the predicted distributions of many species, as can other joint modelling approaches, but they also prove efficient in identifying the influence of local environmental landscapes. CNNs can thus represent signatures of spatially structured environmental drivers. The prediction gain is noticeable for rare species, which opens promising perspectives for biodiversity monitoring and conservation strategies. The approach is therefore of both theoretical and practical interest. We discuss how to test hypotheses on the patterns learnt by CNNs, which should be essential for further interpretation of the ecological processes at play.
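
The contrast drawn above between punctual models, CNN inputs and ablation experiments can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the patch size, and the two ablations shown (averaging a patch to remove heterogeneity, shuffling its cells to remove spatial arrangement) are assumptions chosen for illustration, and the abstract does not specify how the ablated models were actually built.

```python
import numpy as np


def extract_patch(rasters, row, col, half_width):
    """Return the (channels, size, size) window centred on (row, col).

    `rasters` is a hypothetical stack of environmental layers with shape
    (channels, height, width); a CNN-SDM would take such a patch as input.
    """
    r0, r1 = row - half_width, row + half_width + 1
    c0, c1 = col - half_width, col + half_width + 1
    if r0 < 0 or c0 < 0 or r1 > rasters.shape[1] or c1 > rasters.shape[2]:
        raise ValueError("patch extends beyond the raster")
    return rasters[:, r0:r1, c0:c1].copy()


def punctual_input(patch):
    """Keep only the central cell: the information a punctual model sees."""
    centre = patch.shape[1] // 2
    return patch[:, centre, centre]


def ablate_heterogeneity(patch):
    """Assumed ablation: replace every cell by its per-channel mean."""
    means = patch.mean(axis=(1, 2), keepdims=True)
    return np.broadcast_to(means, patch.shape).copy()


def ablate_structure(patch, rng):
    """Assumed ablation: shuffle cell positions, keeping the same values.

    The same permutation is applied to all channels, so values that were
    co-located across layers stay co-located.
    """
    channels, height, width = patch.shape
    perm = rng.permutation(height * width)
    flat = patch.reshape(channels, -1)
    return flat[:, perm].reshape(channels, height, width)


# Toy data: 2 environmental layers on a 7 x 7 grid (made-up values).
rasters = np.arange(2 * 7 * 7, dtype=float).reshape(2, 7, 7)
patch = extract_patch(rasters, row=3, col=3, half_width=2)
point = punctual_input(patch)
flat_patch = ablate_heterogeneity(patch)
shuffled_patch = ablate_structure(patch, np.random.default_rng(0))
```

Under these assumptions, comparing a model trained on `patch` with models trained on `flat_patch`, `shuffled_patch` or `point` would indicate how much predictive performance depends on the heterogeneity and spatial arrangement of the local landscape.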

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biodiversity*
  • France
  • Models, Statistical*
  • Neural Networks, Computer*
  • Plants / classification*

Grants and funding

This study was made possible by the financial support of the Labex Numev and the French National Research Agency under the Investments for the Future Program, referred to as ANR-16-CONV-0004. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.