A discussion on significance indices for contingency tables under small sample sizes

Natalia L Oliveira; Carlos A de B Pereira; Marcio A Diniz; Adriano Polpo

doi:10.1371/journal.pone.0199102

PLoS One. 2018 Aug 2;13(8):e0199102. doi: 10.1371/journal.pone.0199102. eCollection 2018.

Authors

Natalia L Oliveira¹, Carlos A de B Pereira², Marcio A Diniz³, Adriano Polpo³

Affiliations

¹ Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, United States of America.
² Department of Statistics, University of Sao Paulo, Sao Paulo, Brazil.
³ Department of Statistics, Federal University of Sao Carlos, Sao Carlos, Brazil.

Abstract

Hypothesis testing in contingency tables is usually based on asymptotic results, thereby restricting its proper use to large samples. To study these tests in small samples, we consider the likelihood ratio test (LRT) and define an accurate index for the celebrated hypotheses of homogeneity, independence, and Hardy-Weinberg equilibrium. The aim is to understand the use of the asymptotic results of the frequentist likelihood ratio test and the Bayesian Full Bayesian Significance Test (FBST) under small-sample scenarios. The proposed exact LRT p-value is used as a benchmark for understanding the other indices. We perform the analysis in different scenarios, considering different sample sizes and different table dimensions. Fisher's conditional exact test for 2 × 2 tables and Barnard's exact test are also discussed. The main message of this paper is that all indices behave very similarly, except for the Fisher and Barnard tests, which behave discretely. The most powerful test was the one based on the asymptotic p-value from the likelihood ratio test, suggesting that it is a good alternative for small sample sizes.
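
As a rough illustration of two of the indices mentioned in the abstract, the sketch below computes the asymptotic LRT p-value and the two-sided Fisher conditional p-value for the independence hypothesis in a single 2 × 2 table. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the example counts are made up, and neither the proposed exact LRT p-value nor the FBST is reproduced. The chi-square tail is evaluated with the identity P(X > g) = erfc(sqrt(g / 2)), which holds only for one degree of freedom, so the asymptotic function applies to 2 × 2 tables only.

```python
import math


def g_statistic(table):
    """Likelihood ratio (G) statistic for independence in a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # a zero cell contributes 0 to the sum
                expected = row_totals[i] * col_totals[j] / n
                g += 2.0 * observed * math.log(observed / expected)
    return g


def asymptotic_lrt_pvalue_2x2(table):
    """Chi-square tail probability of G with 1 degree of freedom (2 x 2 only)."""
    g = max(g_statistic(table), 0.0)  # guard against tiny negative rounding
    return math.erfc(math.sqrt(g / 2.0))


def fisher_exact_pvalue_2x2(table):
    """Two-sided Fisher p-value: total probability of all tables with the
    same margins that are no more probable than the observed one."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(r1, x) * math.comb(r2, c1 - x) / math.comb(n, c1)

    p_observed = prob(a)
    low, high = max(0, c1 - r2), min(r1, c1)
    # Small relative tolerance so ties with the observed table are included.
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


if __name__ == "__main__":
    example = [[3, 1], [1, 3]]  # made-up counts, n = 8
    print("asymptotic LRT p-value:", asymptotic_lrt_pvalue_2x2(example))
    print("Fisher exact p-value:  ", fisher_exact_pvalue_2x2(example))
```

With fixed margins and n = 8, the top-left cell can take only five values, so the Fisher p-value can take only a handful of distinct values. This is one way to see the discrete behavior that the abstract attributes to the Fisher and Barnard tests.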

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Bayes Theorem
Benchmarking* / methods
Benchmarking* / statistics & numerical data
Chi-Square Distribution
Data Interpretation, Statistical*
Humans
Likelihood Functions
Models, Statistical*
Research Design
Sample Size

Grants and funding

This work was partially supported by the Brazilian agencies FAPESP grant 2012/16669-4, and CNPq grants 302767/2017-7 and 308776/2014-3. The agencies had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.