Survey data on earnings tend to contain measurement error. Administrative data are superior in principle, but they are worthless when records are mismatched. We develop methods for prediction in mixture factor analysis models that combine both data sources to arrive at a single earnings figure. We apply the methods to a Swedish data set. Our results show that register earnings data perform poorly if there is even a small probability of a mismatch. Survey earnings data are more reliable, despite their measurement error. Predictors that combine both sources and take conditional class probabilities into account outperform all other predictors.
Keywords: Factor score; administrative data; finite mixture; structural equation model; validation study.