Long-term causal effects estimation via latent surrogates representation learning

Ruichu Cai; Weilin Chen; Zeqin Yang; Shu Wan; Chen Zheng; Xiaoqing Yang; Jiecheng Guo

doi:10.1016/j.neunet.2024.106336

Neural Netw. 2024 Apr 29:176:106336. doi: 10.1016/j.neunet.2024.106336. Online ahead of print.

Authors

Ruichu Cai¹, Weilin Chen², Zeqin Yang², Shu Wan³, Chen Zheng⁴, Xiaoqing Yang⁴, Jiecheng Guo⁴

Affiliations

¹ School of Computer Science, Guangdong University of Technology, Guangzhou, China. Electronic address: cairuichu@gdut.edu.cn.
² School of Computer Science, Guangdong University of Technology, Guangzhou, China.
³ School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA.
⁴ Didi Chuxing, Beijing, China.

PMID: 38703421

Abstract

Estimating long-term causal effects from short-term surrogates is an important but challenging problem in many real-world applications, such as marketing and medicine. Most existing methods estimate causal effects under idealized, simplistic assumptions: they disregard unobserved surrogates and treat all short-term outcomes as surrogates. Such methods are poorly suited to real-world scenarios in which the short-term outcomes mix partially observed surrogates with proxies of unobserved surrogates. To address this issue, we develop a flexible method, LASER, that estimates long-term causal effects in a more realistic setting where each surrogate is either observed or has observed proxies. LASER employs an identifiable variational autoencoder to learn a latent surrogate representation from all surrogate candidates, without needing to distinguish observed surrogates from proxies of unobserved surrogates. With the learned representation, we further devise a theoretically guaranteed, unbiased estimator of long-term causal effects. Extensive experiments on real-world and semi-synthetic datasets demonstrate the effectiveness of the proposed method.
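
To make the general surrogate idea concrete, the following is a minimal sketch of a simple linear surrogate-based estimate on synthetic data. It is not the LASER method: it assumes every surrogate is fully observed (the idealized setting the abstract argues against) and contains no variational autoencoder. The function name `surrogate_index_effect`, the data-generating process, and all coefficients are hypothetical choices made only for illustration.

```python
import numpy as np


def surrogate_index_effect(s_obs, y_obs, s_exp, w_exp):
    """Hypothetical helper: linear surrogate-based long-term effect estimate.

    Assumes all surrogates are observed, which LASER does not require.
    """
    # Step 1: fit E[Y | S] by least squares on observational data,
    # where both surrogates S and the long-term outcome Y are recorded.
    x_obs = np.column_stack([np.ones(len(s_obs)), s_obs])
    beta, *_ = np.linalg.lstsq(x_obs, y_obs, rcond=None)

    # Step 2: predict the unobserved long-term outcome in experimental data.
    x_exp = np.column_stack([np.ones(len(s_exp)), s_exp])
    y_hat = x_exp @ beta

    # Step 3: difference in mean predictions between treatment arms.
    return y_hat[w_exp == 1].mean() - y_hat[w_exp == 0].mean()


rng = np.random.default_rng(0)
n = 20_000

# Experimental sample: randomized treatment W shifts two surrogates;
# the long-term outcome is not observed here.
w_exp = rng.integers(0, 2, size=n)
s_exp = np.column_stack([
    1.0 * w_exp + rng.normal(size=n),
    0.5 * w_exp + rng.normal(size=n),
])

# Observational sample: surrogates and long-term outcome are both observed.
s_obs = rng.normal(size=(n, 2))
y_obs = 2.0 * s_obs[:, 0] + 1.0 * s_obs[:, 1] + rng.normal(size=n)

effect = surrogate_index_effect(s_obs, y_obs, s_exp, w_exp)
print(f"estimated long-term effect: {effect:.3f}")
```

Under this synthetic setup the true long-term effect is 2.0 × 1.0 + 1.0 × 0.5 = 2.5, so the printed estimate should land near that value. If one of the two surrogates were hidden and only a noisy proxy of it were available, this baseline would generally be biased, which is the setting the abstract says LASER is designed for.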

Keywords: Identifiable variational autoencoder; LASER; Long-term causal effects; Surrogates.