Estimating Incremental Validity Under Missing Data

Dustin A Fife; Jorge L Mendoza; Christopher M Berry

doi:10.1080/00273171.2016.1259099

Multivariate Behav Res. 2017 Mar-Apr;52(2):164-177. doi: 10.1080/00273171.2016.1259099. Epub 2016 Dec 20.

Authors

Dustin A Fife¹, Jorge L Mendoza², Christopher M Berry³

Affiliations

¹ Department of Psychology, Rowan University.
² Department of Psychology, University of Oklahoma.
³ Kelley School of Business, Indiana University.

PMID: 27997223
DOI: 10.1080/00273171.2016.1259099

Abstract

A common form of missing data is caused by selection on an observed variable (e.g., Z). If the selection variable was measured and is available, the data are regarded as missing at random (MAR). Selection biases correlation, reliability, and effect size estimates when these estimates are computed on listwise deleted (LD) data sets. On the other hand, maximum likelihood (ML) estimates are generally unbiased and outperform LD in most situations, at least when the data are MAR. The exception is when we estimate the partial correlation. In this situation, LD estimates are unbiased when the cause of missingness is partialled out. In other words, there is no advantage of ML estimates over LD estimates in this situation. We demonstrate that under a MAR condition, even ML estimates may become biased, depending on how partial correlations are computed. Finally, we conclude with recommendations about how future researchers might estimate partial correlations even when the cause of missingness is unknown and, perhaps, unknowable.
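
The central claim about listwise deletion can be illustrated with a small simulation. The sketch below is not the authors' code or design: it assumes trivariate normal data with all correlations set to 0.5, selection of cases with Z above its mean, and hypothetical names (`partial_corr`, `selected`). It compares the zero-order correlation between X and Y with the partial correlation controlling for Z, each computed on the full data and on the listwise deleted data. It does not attempt the ML estimates discussed in the abstract.

```python
import numpy as np

def partial_corr(x, y, z):
    """Partial correlation of x and y controlling for z (hypothetical helper)."""
    r = np.corrcoef(np.vstack([x, y, z]))
    r_xy, r_xz, r_yz = r[0, 1], r[0, 2], r[1, 2]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Assumed population: X, Y, Z standard normal, every correlation equal to 0.5.
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.5, 0.5],
                [0.5, 1.0, 0.5],
                [0.5, 0.5, 1.0]])
n = 200_000
x, y, z = rng.multivariate_normal(np.zeros(3), cov, size=n).T

# Assumed selection rule: only cases with Z above its mean are retained,
# so listwise deletion keeps roughly half of the sample.
selected = z > 0.0

r_full = np.corrcoef(x, y)[0, 1]
r_ld = np.corrcoef(x[selected], y[selected])[0, 1]
p_full = partial_corr(x, y, z)
p_ld = partial_corr(x[selected], y[selected], z[selected])

print(f"zero-order r, full data:       {r_full:.3f}")
print(f"zero-order r, listwise deleted: {r_ld:.3f}")
print(f"partial r,    full data:       {p_full:.3f}")
print(f"partial r,    listwise deleted: {p_ld:.3f}")
```

Under these assumptions the population partial correlation is (0.5 - 0.25) / 0.75 = 1/3. The listwise deleted zero-order correlation should fall noticeably below 0.5, while the listwise deleted partial correlation should stay close to 1/3, because selecting on Z leaves the conditional distribution of X and Y given Z unchanged.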

Keywords: Missing data; incremental validity; listwise deletion; maximum likelihood.

MeSH terms

Algorithms
Computer Simulation
Data Interpretation, Statistical*
Educational Status
Humans
Likelihood Functions*
Monte Carlo Method
Multivariate Analysis*
Reproducibility of Results
Socioeconomic Factors
Students
Universities