The Influence of Region of Interest Heterogeneity on Classification Accuracy in Wetland Systems

Remote Sens (Basel). 2019 Mar 6;11(5):551. doi: 10.3390/rs11050551.

Abstract

Classifying and mapping natural systems such as wetlands using remote sensing frequently relies on data derived from regions of interest (ROIs), often acquired during field campaigns. ROIs tend to be heterogeneous in complex systems with a variety of land cover classes. However, traditional supervised image classification is predicated on pure single-class observations to train a classifier. This ultimately encourages end-users to create single-class ROIs, either by nudging ROIs away from field-based points or by gerrymandering the ROI, which may produce ROIs unrepresentative of the landscape and potentially introduce error into the classification. In this study, we used WorldView-2 images and 228 field-based data points to define ROIs of varying heterogeneity levels in terms of class membership to classify and map 22 discrete classes in a large and complex wetland system. The goal was to include rather than avoid ROI heterogeneity and assess its impact on classification accuracy. Parametric and nonparametric classifiers were tested with ROI heterogeneity that varied from 7% to 100%. Heterogeneity was governed by ROI area, which we increased from the field-sampling frame of ~100 m² nearly 19-fold to ~2124 m². In general, overall accuracy (OA) trended downward with increasing heterogeneity but stayed relatively high until extreme heterogeneity levels were reached. Moreover, the differences in OA were not statistically significant across several small-to-large heterogeneity levels. Per-class user's and producer's accuracies behaved similarly. Our findings suggest that ROI heterogeneity did not harm classification accuracy unless heterogeneity became extreme, and thus there are substantial practical advantages to accommodating heterogeneous ROIs in image classification. Rather than attempting to avoid ROI heterogeneity by gerrymandering, classification in wetland environments, as well as analyses of other complex environments, should embrace ROI heterogeneity.
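For readers less familiar with the accuracy measures named above, the following is a minimal sketch of how overall accuracy and per-class user's and producer's accuracies are conventionally computed from a confusion matrix. It is not the authors' code: the function name `accuracy_metrics`, the row/column convention (rows = mapped class, columns = reference class), and the 3-class matrix are illustrative assumptions, and the numbers are invented rather than taken from the study.

```python
def accuracy_metrics(confusion):
    """Compute OA, user's and producer's accuracies.

    Assumed convention (hypothetical, for illustration only):
    confusion[i][j] = number of samples mapped as class i
    whose reference (ground-truth) class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    diagonal = [confusion[i][i] for i in range(n)]
    row_totals = [sum(confusion[i]) for i in range(n)]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]

    # Overall accuracy: correctly classified samples / all samples.
    overall = sum(diagonal) / total
    # User's accuracy: correct / everything mapped as that class (row total).
    users = [d / t if t else float("nan") for d, t in zip(diagonal, row_totals)]
    # Producer's accuracy: correct / all reference samples of that class (column total).
    producers = [d / t if t else float("nan") for d, t in zip(diagonal, col_totals)]
    return overall, users, producers


# Invented 3-class example; not data from the study.
example = [
    [40, 5, 5],
    [2, 45, 3],
    [3, 5, 42],
]
overall, users, producers = accuracy_metrics(example)
print(f"OA = {overall:.3f}")
print("User's accuracy:", [round(u, 3) for u in users])
print("Producer's accuracy:", [round(p, 3) for p in producers])
```

User's accuracy reflects errors of commission (how reliable a mapped class is), while producer's accuracy reflects errors of omission (how much of a reference class was captured), which is why the two are reported separately per class.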

Keywords: Lake Baikal; Selenga River Delta; WorldView-2; general linear model (GLM); gerrymandering; methods; mixed pixels; multinomial linear model (MLM); random forest (RF); support vector machine (SVM).