Do Not Divide Count Data with Count Data; A Story from Pollination Ecology with Implications Beyond

PLoS One. 2016 Feb 12;11(2):e0149129. doi: 10.1371/journal.pone.0149129. eCollection 2016.

Abstract

Studies in ecology often describe observed variation in an ecological phenomenon using environmental explanatory variables. A common problem is that the numerical nature of the phenomenon does not always fit the assumptions underlying traditional statistical tests. A textbook example comes from pollination ecology, where flower visits are normally reported as frequencies: number of visits per flower per unit time. Using visitation frequencies in statistical analyses comes with two major caveats: their error distribution is unknown, and they do not retain all the information in the data, since 10 visits in 20 flowers are treated the same as 100 visits in 200 flowers. We simulated datasets with various "flower visitation distributions" over various numbers of flowers observed (exposure) and with different types of effects inducing variation in the data. Each dataset was then analyzed first with the traditional approach, using the number of visits per flower, and then with count data models. The count data models gave a much better chance of detecting effects than the traditionally used frequency approach. We conclude that if the data structure, statistical analyses and interpretations of results are mixed up, valuable information can be lost.
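The information lost by dividing counts can be illustrated with the abstract's own numbers. The sketch below is not taken from the study. It assumes, purely for illustration, that visits follow a Poisson distribution whose mean is proportional to the number of flowers observed, and the helper name `rate_and_se` is hypothetical. Under that assumption the estimated rate is visits / flowers, and its approximate standard error is sqrt(visits) / flowers.

```python
import math


def rate_and_se(visits: int, flowers: int):
    """Return the visits-per-flower rate and its approximate standard error.

    Assumes visits ~ Poisson(rate * flowers). This is an illustrative
    assumption, not the simulation design used in the study.
    """
    if flowers <= 0:
        raise ValueError("flowers must be positive")
    rate = visits / flowers            # the usual visitation frequency
    se = math.sqrt(visits) / flowers   # Poisson standard error of that rate
    return rate, se


for visits, flowers in [(10, 20), (100, 200)]:
    rate, se = rate_and_se(visits, flowers)
    print(f"{visits} visits / {flowers} flowers: rate={rate:.2f}, SE={se:.3f}")
# prints:
# 10 visits / 20 flowers: rate=0.50, SE=0.158
# 100 visits / 200 flowers: rate=0.50, SE=0.050
```

Both observations give the same frequency of 0.5 visits per flower, but the larger sample estimates it about three times more precisely. A count data model that carries the number of flowers as exposure can use that difference, whereas the frequency alone discards it.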

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Area Under Curve
  • Computer Simulation*
  • Ecological and Environmental Phenomena
  • Flowers / physiology
  • Models, Biological*
  • Models, Statistical*
  • Pollination*
  • ROC Curve

Grants and funding

This work was supported by the Norwegian Research Council, 230279/E50. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.