Statistical requirements for properly investigating a null hypothesis

Psychol Rep. 2010 Dec;107(3):953-71. doi: 10.2466/02.03.17.21.PR0.107.6.953-971.

Abstract

Issues involved in the evaluation of null hypotheses are discussed. The use of equivalence testing is recommended as a possible alternative to the use of simple t or F tests for evaluating a null hypothesis. When statistical power is low and larger sample sizes are not available or practical, consideration should be given to using one-tailed tests or less conservative criterion levels of statistical significance. Effect sizes should always be reported along with significance levels, as both are needed to understand results of research. Probabilities alone are not enough and are especially problematic for very large or very small samples. Pre-existing group differences should be tested and properly accounted for when comparing independent groups on dependent variables. If confirmation of a null hypothesis is expected, potential suppressor variables should be considered. If different methods are used to select the samples to be compared, controls for social desirability bias should be implemented. When researchers deviate from these standards or appear to assume that such standards are unimportant or irrelevant, their results should be deemed less credible than when such standards are maintained and followed. Several examples of recent violations of such standards in family social science, comparing gay, lesbian, bisexual, and transgender families with heterosexual families, are provided. Regardless of their political values or expectations, researchers should strive to test null hypotheses rigorously, in accordance with the best professional standards.
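
The recommendation to pair equivalence testing with effect sizes can be illustrated with a short sketch. The code below is not taken from the article; it is a minimal example under stated assumptions: the function names `cohens_d` and `tost_z` are hypothetical, the equivalence margin is chosen by the analyst, and the two one-sided tests (TOST) use a normal approximation, whereas a t distribution would usually be preferred for small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def tost_z(a, b, margin):
    """Two one-sided tests for equivalence within +/- margin (raw units).

    Assumption: the sampling distribution of the mean difference is treated
    as normal, which is only reasonable for fairly large samples.
    Returns the observed difference and the TOST p-value.
    """
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist()
    p_lower = 1 - z.cdf((diff + margin) / se)  # H0: diff <= -margin
    p_upper = z.cdf((diff - margin) / se)      # H0: diff >= +margin
    # Equivalence is supported only if BOTH one-sided tests reject,
    # so the overall p-value is the larger of the two.
    return diff, max(p_lower, p_upper)


if __name__ == "__main__":
    # Illustrative, made-up scores for two independent groups.
    group_a = [1, 2, 3, 4, 5] * 20
    group_b = [1, 2, 3, 4, 5] * 20
    diff, p = tost_z(group_a, group_b, margin=1.0)
    print("difference:", diff, "TOST p:", p, "d:", cohens_d(group_a, group_b))
```

A small TOST p-value supports the claim that the groups differ by less than the chosen margin, which is a stronger statement than a nonsignificant t test. Reporting `cohens_d` alongside it keeps the effect size visible regardless of sample size.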

MeSH terms

  • Female
  • Humans
  • Male
  • Research / statistics & numerical data*
  • Sample Size
  • Sexuality / statistics & numerical data
  • Statistics as Topic / standards*