CCAE: Cross-field categorical attributes embedding for cancer clinical endpoint prediction

Artif Intell Med. 2020 Jul:107:101915. doi: 10.1016/j.artmed.2020.101915. Epub 2020 Jun 26.

Abstract

Patients with advanced cancer carry a heavy physical and psychological burden, so their health-related quality of life (HRQOL) urgently needs more attention. With a predicted clinical endpoint, over-treatment can be effectively eliminated by introducing palliative care at the right time. This paper develops a deep-learning-based approach to cancer clinical endpoint prediction from patients' electronic health records (EHR). Categorical information is pervasive in EHR and poses an unavoidable obstacle to numerical learning algorithms. To address this issue, we propose a novel cross-field categorical attributes embedding (CCAE) model that learns a vectorized representation of cancer patients at the attribute level, order by order, in which the strong semantic coupling among categorical variables is well exploited. We recast order-dependency modeling as a sequence learning task and adopt a recurrent neural network to capture the semantic relevance among the multi-order representations. Experimental results on the SEER-Medicare EHR dataset show that the proposed model achieves competitive prediction performance compared with the baselines.
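
The pipeline outlined in the abstract (embed categorical fields, build representations per interaction order, read the orders as a sequence with a recurrent network) can be illustrated with a minimal sketch. The abstract does not say how the order-wise representations are built, so everything concrete here is an assumption: the field names and vocabularies are hypothetical, the second-order vector is taken as a sum of element-wise products of cross-field embedding pairs, the recurrent unit is a plain tanh cell, and the weights are random and untrained. It is not the authors' implementation.

```python
import math
import random

random.seed(0)
DIM = 4  # embedding size (hypothetical choice)

# Hypothetical categorical EHR fields and vocabularies (not from the paper).
FIELDS = {
    "site": ["lung", "breast", "colon"],
    "stage": ["I", "II", "III", "IV"],
    "sex": ["F", "M"],
}

def rand_vec(n):
    return [random.uniform(-0.5, 0.5) for _ in range(n)]

def rand_mat(rows, cols):
    return [rand_vec(cols) for _ in range(rows)]

# One embedding table per field: category value -> dense vector.
EMB = {f: {v: rand_vec(DIM) for v in vals} for f, vals in FIELDS.items()}

def order_representations(patient):
    """Return [first-order, second-order] vectors for one patient (assumed form)."""
    vecs = [EMB[f][patient[f]] for f in FIELDS]
    # First order: sum of the per-field embeddings.
    first = [sum(col) for col in zip(*vecs)]
    # Second order: sum of element-wise products over all cross-field pairs.
    second = [0.0] * DIM
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            for k in range(DIM):
                second[k] += vecs[i][k] * vecs[j][k]
    return [first, second]

# Untrained parameters of a plain tanh recurrent cell and a logistic output.
W_X = rand_mat(DIM, DIM)
W_H = rand_mat(DIM, DIM)
W_OUT = rand_vec(DIM)

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def predict(patient):
    """Feed the order representations to the recurrent cell, lowest order first."""
    h = [0.0] * DIM
    for x in order_representations(patient):
        zx, zh = matvec(W_X, x), matvec(W_H, h)
        h = [math.tanh(a + b) for a, b in zip(zx, zh)]
    logit = sum(w * hi for w, hi in zip(W_OUT, h))
    return 1.0 / (1.0 + math.exp(-logit))  # score in (0, 1)

patient = {"site": "lung", "stage": "IV", "sex": "M"}
score = predict(patient)
```

Because the weights are untrained, `score` has no clinical meaning. The sketch only shows how data would flow from categorical values to a single prediction under the stated assumptions.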

Keywords: Categorical variables embedding; Clinical endpoint prediction; Electronic health records.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Electronic Health Records
  • Humans
  • Medicare
  • Neoplasms* / diagnosis
  • Neoplasms* / therapy
  • Neural Networks, Computer
  • Quality of Life*
  • United States