The Sign Test, Paired Data, and Asymmetric Dependence: A Cautionary Tale

Am Stat. 2023;77(1):35-40. doi: 10.1080/00031305.2022.2110938. Epub 2022 Sep 23.

Abstract

In the paired data setting, statistical textbooks often describe the sign test as a test of the difference between the medians of the two marginal distributions. Employing the sign test in this fashion implicitly assumes that the median of the differences equals the difference of the medians. We demonstrate, however, that when the bivariate distribution of the paired data is asymmetric, the median of the differences is often not equal to the difference of the medians. Further, we show that these scenarios lead to a false interpretation of the sign test for its intended use in the paired data setting. We illustrate the false-interpretation concept through theory, a simulation study, and a real-world example based on breast cancer RNA sequencing data obtained from The Cancer Genome Atlas (TCGA).
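The gap between the two quantities can be seen in a small toy construction. The sketch below is not taken from the article: the three-point joint distribution, the replication count, and the helper name `sign_test_pvalue` are all assumptions chosen for illustration. Both margins are uniform on {1, 2, 3}, so the marginal medians are equal, yet the pairing is asymmetric (for instance, (1, 2) occurs but (2, 1) never does), and two thirds of the differences are positive.

```python
from math import comb
from statistics import median

# Hypothetical joint distribution (an assumption for illustration):
# three equally likely (x, y) pairs. Each margin is uniform on {1, 2, 3}.
support = [(1, 2), (2, 3), (3, 1)]
pairs = support * 40  # deterministic "sample" of 120 pairs

x = [a for a, _ in pairs]
y = [b for _, b in pairs]
diffs = [b - a for a, b in pairs]  # y - x for each pair

diff_of_medians = median(y) - median(x)  # compares the two margins
median_of_diffs = median(diffs)          # what the sign test addresses


def sign_test_pvalue(d):
    """Exact two-sided sign test p-value; zero differences are dropped."""
    pos = sum(1 for v in d if v > 0)
    neg = sum(1 for v in d if v < 0)
    n = pos + neg
    k = min(pos, neg)
    # P(Binomial(n, 1/2) <= k), doubled for a two-sided test and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


p_value = sign_test_pvalue(diffs)

print("difference of medians:", diff_of_medians)
print("median of differences:", median_of_diffs)
print("sign test p-value:", p_value)
```

In this construction the difference of the medians is zero while the median of the differences is one, and the sign test rejects at conventional levels. Reading that rejection as evidence that the marginal medians differ would be the kind of false interpretation the abstract describes.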

Keywords: empirical likelihood; exchangeability; nonparametric statistics; re-randomization; statistical bias; symmetry.