Sample coverage estimation, rarefaction, and extrapolation based on sample-based abundance data

Ecology. 2023 Aug;104(8):e4099. doi: 10.1002/ecy.4099. Epub 2023 Jun 20.

Abstract

Sample coverage, the proportion of individuals in an assemblage that belong to species observed in a sample, is a metric used to measure the completeness of a sample. Rather than equal sample size, equal sample coverage has become a widely accepted standard for comparing diversity across multiple assemblages, because it gives a more accurate representation of the true relationship between the richness of the assemblages. In practice, sample-based abundance data are the most frequently used data type for evaluating species diversity. In sample-based abundance data, sampling units (e.g., plots, nets, traps, or transects) are randomly selected from the target area, and the number of individuals of each species observed in each sampled unit is recorded. In this case, the individuals in the sample are no longer randomly and independently sampled, and the Good-Turing estimators of abundance-based sample coverage in reference, rarefied, and extrapolated samples may be severely biased when individuals show a highly spatially aggregated pattern. Here, I derive a novel estimator of abundance-based sample coverage based on the Good-Turing frequency formula. Additionally, a new analytical approach is introduced to enable smooth coverage-based rarefaction and extrapolation for comparing richness among assemblages. Analyses of three ForestGEO permanent forest plot data sets demonstrate that the proposed estimator is nearly unbiased and that the newly developed coverage-based standardization yields a less biased richness ratio.
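As a point of reference for the baseline discussed above, here is a minimal Python sketch of the classical Good-Turing (Turing) coverage estimate, 1 - f1/n, and the widely used Chao-Jost refinement that also uses doubletons. Both assume individuals are sampled randomly and independently, which is the assumption the abstract says breaks down under spatial aggregation. This is not the new estimator derived in the paper, whose formula is not given in the abstract. The function names and the unit-by-species input layout are illustrative assumptions.

```python
def pool_units(unit_counts):
    """Sum a units-by-species table (list of per-unit count lists) into
    one abundance vector. Hypothetical layout: one row per sampling unit."""
    return [sum(col) for col in zip(*unit_counts)]


def turing_coverage(abundances):
    """Classical Turing estimate of sample coverage: 1 - f1 / n,
    where f1 is the number of singletons and n the number of individuals."""
    counts = [x for x in abundances if x > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("sample contains no individuals")
    f1 = sum(1 for x in counts if x == 1)
    return 1.0 - f1 / n


def chao_jost_coverage(abundances):
    """Chao-Jost refinement: 1 - (f1/n) * (n-1)*f1 / ((n-1)*f1 + 2*f2),
    where f2 is the number of doubletons."""
    counts = [x for x in abundances if x > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("sample contains no individuals")
    f1 = sum(1 for x in counts if x == 1)
    f2 = sum(1 for x in counts if x == 2)
    if f1 == 0:
        return 1.0  # no singletons: estimated coverage is complete
    denom = (n - 1) * f1 + 2 * f2
    if denom == 0:
        return 1.0 - f1 / n  # n == 1 and no doubletons: fall back to Turing
    return 1.0 - (f1 / n) * ((n - 1) * f1 / denom)


# Made-up counts for two sampling units and three species.
units = [[3, 0, 1],
         [2, 1, 0]]
pooled = pool_units(units)
print(pooled, turing_coverage(pooled), chao_jost_coverage(pooled))
```

Pooling the units and applying these formulas treats the pooled individuals as an independent random sample. With strongly aggregated individuals, that is the situation in which the abstract reports severe bias.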

Keywords: ForestGEO; Good-Turing frequency formula; extrapolation; rarefaction; sample coverage; sample-based abundance data; spatial aggregation pattern.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biodiversity*
  • Forests
  • Humans
  • Models, Biological*
  • Sample Size