Confidence intervals of the Mann-Whitney parameter that are compatible with the Wilcoxon-Mann-Whitney test

Michael P Fay; Yaakov Malinovsky

doi:10.1002/sim.7890

Stat Med. 2018 Nov 30;37(27):3991-4006. doi: 10.1002/sim.7890. Epub 2018 Jul 8.

Authors

Michael P Fay¹, Yaakov Malinovsky²

Affiliations

¹ Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, Rockville, Maryland.
² Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, Maryland.

Abstract

For the two-sample problem, the Wilcoxon-Mann-Whitney (WMW) test is used frequently: it is simple to explain (a permutation test on the difference in mean ranks), it handles continuous or ordinal responses, it can be implemented for large or small samples, it is robust to outliers, it requires few assumptions, and it is efficient in many cases. Unfortunately, the WMW test is rarely presented with an effect estimate and confidence interval. A natural effect parameter associated with this test is the Mann-Whitney parameter, φ = Pr[X < Y] + 0.5 Pr[X = Y]. Ideally, we desire confidence intervals on φ that are compatible with the WMW test, meaning the test rejects at level α if and only if the 100(1 - α)% confidence interval on the Mann-Whitney parameter excludes 1/2. Existing confidence interval procedures on φ are not compatible with the usual asymptotic implementation of the WMW test that uses a continuity correction, nor are they compatible with exact WMW tests. We develop compatible confidence interval procedures for the asymptotic WMW tests and confidence interval procedures for some exact WMW tests that appear to be compatible. We discuss assumptions and interpretation of the resulting tests and confidence intervals. We provide the wmwTest function of the asht R package to calculate all of the developed confidence intervals.
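As an illustration of the parameter defined in the abstract, the following minimal Python sketch computes a plug-in point estimate of φ in two ways: by counting pairs directly, and from the difference in mean midranks, which is the statistic the WMW test permutes. This is not the confidence interval procedure developed in the article and it does not use the asht package; the function names (mann_whitney_phi, midranks, phi_from_mean_ranks) and the sample data are hypothetical and chosen only for this example.

```python
from itertools import product


def mann_whitney_phi(x, y):
    """Plug-in estimate of phi = Pr[X < Y] + 0.5 * Pr[X = Y] (illustrative helper)."""
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    total = 0.0
    for xi, yj in product(x, y):
        if xi < yj:
            total += 1.0      # pair counts fully when X < Y
        elif xi == yj:
            total += 0.5      # ties count one half
    return total / (len(x) * len(y))


def midranks(values):
    """Return 1-based ranks of values, averaging ranks within tied groups."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def phi_from_mean_ranks(x, y):
    """Same estimate via mean ranks: 0.5 + (mean rank of y - mean rank of x) / (m + n)."""
    m, n = len(x), len(y)
    r = midranks(list(x) + list(y))
    mean_rank_x = sum(r[:m]) / m
    mean_rank_y = sum(r[m:]) / n
    return 0.5 + (mean_rank_y - mean_rank_x) / (m + n)


# Hypothetical data with ties between the two samples.
x = [1, 2, 3, 4]
y = [3, 4, 5, 6]
print(mann_whitney_phi(x, y))     # 0.875
print(phi_from_mean_ranks(x, y))  # 0.875
```

Both routes give the same value because the estimate is a linear function of the difference in mean ranks; an estimate of 1/2 corresponds to equal mean ranks in the two samples.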

Keywords: Mann-Whitney U test; Wilcoxon rank sum test; area under the curve; probabilistic index; receiver operating characteristic curve.

Published 2018. This article is a U.S. Government work and is in the public domain in the USA.

MeSH terms

Confidence Intervals*
Humans
Models, Statistical
Reproducibility of Results
Statistics, Nonparametric*

Grants and funding

Z99 AI999999/Intramural NIH HHS/United States