Pervasive errors in hypothesis testing: Toward better statistical practice in nursing research

Vincent S Staggs

doi:10.1016/j.ijnurstu.2019.06.012

Int J Nurs Stud. 2019 Oct:98:87-93. doi: 10.1016/j.ijnurstu.2019.06.012. Epub 2019 Jul 7.

Author

Vincent S Staggs¹

Affiliation

¹ Biostatistics & Epidemiology Core, Health Services & Outcomes Research, Children's Mercy Kansas City, 2401 Gillham Rd., Kansas City, MO, USA; School of Medicine, University of Missouri-Kansas City, 2411 Holmes St., Kansas City, MO, USA. Electronic address: vstaggs@cmh.edu.

PMID: 31349121
DOI: 10.1016/j.ijnurstu.2019.06.012

Abstract

Background: In recent years, several authors have documented common problems in the use of statistics in nursing research, including failure to consider the effects of multiple testing, inattention to clinical significance, and under-reporting of effect sizes and confidence intervals. More subtle forms of multiple testing are not as widely recognized, and abuse of researcher degrees of freedom has received little attention in the nursing research literature. These and other unsound practices in applying and interpreting statistics are problematic in themselves, and they arguably reflect, among many researchers, an insufficiently clear understanding of statistical inference as a method for dealing with randomness.

Objectives: The goal of this educational paper is to improve the understanding and practice of inferential statistics among nursing researchers. An accessible explanation of hypothesis testing is provided, including discussion of the crucial concept of repeated sampling. Several pervasive mistakes and misconceptions in statistical inference are examined in detail, including misinterpretation of "non-significant" p-values as evidence for the null hypothesis, failure to account for forms of multiple testing that arise in model selection, abuse of researcher degrees of freedom, and hypothesis testing for baseline differences between arms in randomized trials. Recommendations for better statistical practice are offered.
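
The repeated-sampling idea and the multiple-testing problem named above can be illustrated with a small simulation. The following is a minimal sketch and is not taken from the paper: it assumes independent tests, normally distributed data with known standard deviation, and a true null hypothesis in every test. The function names (`two_sided_p`, `false_positive_rates`, `familywise_error`) and all parameter values are hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist, mean


def two_sided_p(sample, sigma=1.0):
    """Two-sided z-test p-value for H0: population mean = 0 (sigma assumed known)."""
    z = mean(sample) / (sigma / len(sample) ** 0.5)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def false_positive_rates(n_studies=2000, n_tests=10, n=30, alpha=0.05, seed=1):
    """Simulate many hypothetical studies in which every null hypothesis is true.

    Returns two proportions across the simulated studies:
      - how often a single pre-specified test is "significant"
      - how often at least one of n_tests tests is "significant"
    """
    rng = random.Random(seed)
    single_hits = 0
    any_hits = 0
    for _ in range(n_studies):
        # Each test draws a fresh sample from a population whose true mean is 0.
        pvals = [
            two_sided_p([rng.gauss(0.0, 1.0) for _ in range(n)])
            for _ in range(n_tests)
        ]
        single_hits += pvals[0] < alpha
        any_hits += min(pvals) < alpha
    return single_hits / n_studies, any_hits / n_studies


def familywise_error(alpha, k):
    """Chance of at least one false positive in k independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** k


if __name__ == "__main__":
    single, any_of_ten = false_positive_rates()
    print(f"single test false-positive rate:    {single:.3f}")
    print(f"at least one of 10 tests below 0.05: {any_of_ten:.3f}")
    print(f"theoretical family-wise error rate:  {familywise_error(0.05, 10):.3f}")
```

Under these assumptions, a single test rejects a true null in roughly 5% of repeated samples, while the chance that at least one of ten independent tests does so is 1 - (1 - 0.05)^10, or about 0.40. Real analyses rarely involve fully independent tests, so the sketch shows the direction of the problem rather than its exact size in any given study.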

Conclusion: For the foreseeable future, classical methods of statistical inference based on the idea of repeated sampling will be the primary tools for quantifying randomness in nursing research. The hypothesis testing framework, despite its limitations, can be helpful in ruling out chance as an explanation for observed effects. Nursing researchers who use quantitative methods, as well as journal reviewers and editors, should understand this framework well. Those involved in educating nursing researchers and those who teach statistics would do well to ask what changes need to be made to raise the level of statistical practice in nursing research.

Keywords: Nursing research; Research methods; Statistical methods; Statistics.

MeSH terms

Data Interpretation, Statistical
Models, Statistical*
Nursing Research*
Research Design