Predictive Analytics with Strategically Missing Data

INFORMS J Comput. 2020 Fall;32(4):1143-1156. doi: 10.1287/ijoc.2019.0947. Epub 2020 May 22.

Abstract

We study strategically missing data problems in regression-based predictive analytics. In many real-world settings, such as financial reporting, college admissions, job applications, and marketing and advertising, data providers often conceal certain information on purpose to gain a favorable outcome. The decision-maker therefore needs a mechanism for dealing with such strategic behavior. We propose a novel approach to handling strategically missing data in regression prediction. The proposed method derives imputation values for strategically missing data from support vector regression (SVR) models, and it gives data providers an incentive to disclose their true information. We show that, under some reasonable conditions, the proposed method minimizes the imputation errors for the missing values. An experimental study on real-world data demonstrates the effectiveness of the proposed approach.

Keywords: business analytics; data manipulation; information disclosure; strategic learning; support vector regression.
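
The disclosure incentive described in the abstract can be illustrated with a minimal sketch. This is not the paper's method, whose details the abstract does not give. It assumes a linear predictor whose weights were already fitted (for example by a linear SVR), that a higher prediction is the favorable outcome, and that each feature has a known feasible range. The names `impute_least_favorable`, `predict`, `WEIGHTS`, `BIAS`, and `BOUNDS`, and all numeric values, are hypothetical. A concealed feature is filled with the least favorable value in its range, so concealing never yields a higher score than disclosing.

```python
from typing import List, Optional, Sequence, Tuple


def impute_least_favorable(
    x: Sequence[Optional[float]],
    weights: Sequence[float],
    bounds: Sequence[Tuple[float, float]],
) -> List[float]:
    """Fill each missing entry (None) with the bound that minimizes w * x."""
    filled = []
    for value, w, (lo, hi) in zip(x, weights, bounds):
        if value is None:
            # A non-negative weight is minimized at the lower bound,
            # a negative weight at the upper bound.
            filled.append(lo if w >= 0 else hi)
        else:
            filled.append(value)
    return filled


def predict(x: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Linear prediction w . x + b (assumed to come from a fitted linear model)."""
    return sum(w * v for w, v in zip(weights, x)) + bias


# Hypothetical fitted parameters and feasible feature ranges (assumptions).
WEIGHTS = [2.0, -1.0, 0.5]
BIAS = 1.0
BOUNDS = [(0.0, 10.0), (0.0, 5.0), (0.0, 4.0)]

truthful = [6.0, 2.0, 3.0]    # provider discloses everything
concealed = [6.0, None, 3.0]  # provider hides the unfavorable second feature

imputed = impute_least_favorable(concealed, WEIGHTS, BOUNDS)
score_truthful = predict(truthful, WEIGHTS, BIAS)
score_concealed = predict(imputed, WEIGHTS, BIAS)

print(imputed, score_truthful, score_concealed)
```

In this toy setting the provider who hides the second feature is scored as if it took its worst feasible value, so the concealed score cannot exceed the truthful one. The paper's approach additionally aims to minimize imputation error, which this sketch does not attempt.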