Predicting the distribution of out-of-reach biotopes with decision trees in a Swedish marine protected area

Ecol Appl. 2012 Dec;22(8):2248-64. doi: 10.1890/11-1608.1.

Abstract

Through spatially explicit predictive models, knowledge of spatial patterns of biota can be generated for out-of-reach environments, where there is a paucity of survey data. This knowledge is invaluable for conservation decisions. We used distribution modeling to predict the occurrence of benthic biotopes, or megafaunal communities of the seabed, to support the spatial planning of a marine national park. Nine biotope classes were obtained prior to modeling from multivariate species data derived from point source, underwater imagery. Five map layers relating to depth and terrain were used as predictor variables. Biotope type was predicted on a pixel-by-pixel basis, where pixel size was 15 × 15 m and total modeled area was 455 km². To choose a suitable modeling technique we compared the performance of five common models based on recursive partitioning: two types of classification and regression trees ([1] pruned by 10-fold cross-validation and [2] pruned by minimizing complexity), random forests, conditional inference (CI) trees, and CI forests. The selected model was a CI forest (an ensemble of CI trees), a machine-learning technique whose discriminatory power (class-by-class area under the curve [AUC] ranged from 0.75 to 0.86) and classification accuracy (72%) surpassed those of the other methods tested. Conditional inference trees are virtually new to the field of ecology. The final model's overall prediction error was 28%. Model predictions were also checked against a custom-built measure of dubiousness, calculated at the polygon level. Key factors other than the choice of modeling technique include: the use of a multinomial response, accounting for the heterogeneity of observations, and spatial autocorrelation. To illustrate how the model results can be implemented in spatial planning, representation of biodiversity in the national park was described and quantified.
Given a goal of maximizing classification accuracy, we conclude that conditional inference trees are a promising tool to map biota. Species distribution modeling is presented as an ecological tool that can handle a wide variety of systems (e.g., the benthic system).
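
The two evaluation measures quoted in the abstract, class-by-class AUC and overall classification accuracy, can be illustrated with a minimal sketch. This is not the authors' implementation: the class labels, predicted probabilities, and function names below are hypothetical, and AUC is computed one-vs-rest with the rank-based (Mann-Whitney) formulation, which is only an assumption about how a class-by-class AUC might be obtained for a multinomial response.

```python
from typing import Dict, List, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of observations whose predicted class equals the observed class."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def one_vs_rest_auc(y_true: Sequence[str], scores: Sequence[float], target: str) -> float:
    """AUC for `target` against all other classes (Mann-Whitney formulation).

    Returns the share of (positive, negative) pairs in which the positive
    observation receives the higher score; ties count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == target]
    neg = [s for t, s in zip(y_true, scores) if t != target]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def class_by_class_auc(y_true: Sequence[str],
                       proba: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """One AUC per class, each computed from that class's predicted probability."""
    classes = sorted(set(y_true))
    return {c: one_vs_rest_auc(y_true, [p[c] for p in proba], c) for c in classes}


# Hypothetical toy data: three made-up biotope labels, six observations.
y_true: List[str] = ["A", "A", "B", "B", "C", "C"]
proba: List[Dict[str, float]] = [
    {"A": 0.7, "B": 0.2, "C": 0.1},
    {"A": 0.4, "B": 0.5, "C": 0.1},
    {"A": 0.2, "B": 0.6, "C": 0.2},
    {"A": 0.3, "B": 0.4, "C": 0.3},
    {"A": 0.1, "B": 0.2, "C": 0.7},
    {"A": 0.3, "B": 0.3, "C": 0.4},
]

# Predicted class = class with the highest predicted probability.
y_pred = [max(p, key=p.get) for p in proba]

overall_accuracy = accuracy(y_true, y_pred)
overall_error = 1.0 - overall_accuracy  # prediction error is the complement
auc_per_class = class_by_class_auc(y_true, proba)

print(overall_accuracy, overall_error, auc_per_class)
```

Because prediction error is the complement of classification accuracy, the abstract's 72% accuracy and 28% overall prediction error describe the same result.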

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Conservation of Natural Resources / methods*
  • Demography
  • Ecosystem*
  • Fishes / physiology*
  • Geographic Information Systems
  • Models, Biological*
  • Oceans and Seas
  • Sweden