Analyzing partially paired data: when can the unpaired portion(s) be safely ignored?

Qianya Qi; Li Yan; Lili Tian

doi:10.1080/02664763.2020.1864813

J Appl Stat. 2020 Dec 23;49(6):1402-1420. doi: 10.1080/02664763.2020.1864813. eCollection 2022.

Authors

Qianya Qi¹, Li Yan², Lili Tian¹

Affiliations

¹ Department of Biostatistics, University at Buffalo, Buffalo, NY, USA.
² Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA.

Abstract

Partially paired data, with incompleteness in one or both arms, are common in practice. To test equality of the means of the two arms, practitioners often use only the complete pairs and perform paired tests. Although such tests (referred to as 'naive paired tests') are legitimate, their power might be low because only part of the data is used. The recently proposed 'P-value pooling methods', which combine P-values from two tests, use all the data and have reasonable type-I error control and good power. While it is generally believed that 'P-value pooling methods' are superior to 'naive paired tests' in terms of power because the former use more data than the latter, no detailed power comparison has been done. This paper compares the power of 'naive paired tests' and 'P-value pooling methods' analytically. Our findings are counterintuitive: the 'P-value pooling methods' do not always outperform the 'naive paired tests' in terms of power. Based on these results, we present guidance on how to select the best test for testing equality of means with partially paired data.

Keywords: Hypothesis testing; P-value; normality; paired data.
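
The two approaches contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: a large-sample normal (z) approximation stands in for the t-tests, the pooling rule is a weighted inverse-normal (Stouffer-type) combination, and the weights are left to the caller. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev, variance

STD_NORMAL = NormalDist()  # standard normal, used for all z approximations


def paired_one_sided_p(pairs):
    """One-sided P-value (H1: mean of arm 1 > mean of arm 2) from complete pairs only.

    Assumption: z approximation to the paired t-test.
    """
    diffs = [a - b for a, b in pairs]
    z = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return 1.0 - STD_NORMAL.cdf(z)


def unpaired_one_sided_p(xs, ys):
    """One-sided P-value from the unpaired observations of the two arms.

    Assumption: z approximation to an unequal-variance two-sample test.
    """
    se = sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))
    z = (mean(xs) - mean(ys)) / se
    return 1.0 - STD_NORMAL.cdf(z)


def two_sided(p_one_sided):
    """Convert a one-sided P-value into a two-sided one."""
    return 2.0 * min(p_one_sided, 1.0 - p_one_sided)


def pooled_two_sided_p(p_paired, p_unpaired, w_paired, w_unpaired):
    """Pool two one-sided P-values with a weighted inverse-normal rule.

    Assumption: this pooling rule and its weights are illustrative choices,
    not necessarily those studied in the paper.
    """
    z = (w_paired * STD_NORMAL.inv_cdf(1.0 - p_paired)
         + w_unpaired * STD_NORMAL.inv_cdf(1.0 - p_unpaired))
    z /= sqrt(w_paired ** 2 + w_unpaired ** 2)
    return two_sided(1.0 - STD_NORMAL.cdf(z))


# Hypothetical toy data: five complete pairs plus unpaired observations in each arm.
pairs = [(5.1, 4.8), (6.0, 5.2), (5.5, 5.6), (6.3, 5.9), (5.8, 5.1)]
arm1_only = [6.1, 5.9, 6.4]
arm2_only = [5.0, 5.3, 4.9, 5.2]

p_pair = paired_one_sided_p(pairs)
p_unp = unpaired_one_sided_p(arm1_only, arm2_only)

naive_p = two_sided(p_pair)  # 'naive paired test': unpaired data ignored
pooled_p = pooled_two_sided_p(  # 'P-value pooling': both portions used
    p_pair, p_unp,
    w_paired=sqrt(len(pairs)),
    w_unpaired=sqrt(len(arm1_only) + len(arm2_only)),
)
print(f"naive paired P = {naive_p:.4g}, pooled P = {pooled_p:.4g}")
```

Under this rule, the weight placed on the unpaired portion determines how much the pooled statistic departs from the paired one, which is one place where a power comparison between the two approaches can turn on the design.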