Construction of longitudinal prediction targets using semisupervised learning

Stat Methods Med Res. 2018 Sep;27(9):2674-2693. doi: 10.1177/0962280216684163. Epub 2017 Jan 8.

Abstract

In establishing prognostic models, often aided by machine learning methods, much effort is concentrated on identifying good predictors. However, the same level of rigor is often absent in improving the outcome side of the models. In this study, we focus on this rather neglected aspect of model development. We are particularly interested in the use of longitudinal information as a way of improving the outcome side of prognostic models. This involves optimally characterizing individuals' outcome status, classifying them, and validating the formulated prediction targets. None of these tasks is straightforward, which may explain why longitudinal prediction targets are not commonly used in practice despite their compelling benefits. As a way of improving this situation, we explore the joint use of empirical model fitting, clinical insights, and cross-validation based on how well the formulated targets are predicted by clinically relevant baseline characteristics (antecedent validators). The idea here is that all these methods are imperfect but can be used together to triangulate valid prediction targets. The proposed approach is illustrated using data from the Longitudinal Assessment of Manic Symptoms study.
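The cross-validation idea in the abstract, judging a candidate prediction target by how well a baseline characteristic predicts it, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' procedure: the data are synthetic, each person's follow-up scores are summarized by their mean, candidate targets are formed by simple cutoffs on that mean (standing in for the latent trajectory classes and clinical thresholds the abstract mentions), and the names `auc`, `cv_auc`, `baseline`, and `candidate_cutoffs` are hypothetical.

```python
import random
from statistics import mean


def auc(scores, labels):
    """Chance that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are needed to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cv_auc(validator, target, k=5, seed=0):
    """k-fold cross-validated AUC for predicting a binary target from one validator.

    The only thing learned in each training fold is the direction of the
    association (whether positives have the higher validator mean); held-out
    individuals are then scored with the signed validator value.
    """
    idx = list(range(len(target)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = [0.0] * len(target)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        pos = [validator[i] for i in train if target[i] == 1]
        neg = [validator[i] for i in train if target[i] == 0]
        if not pos or not neg:
            raise ValueError("a training fold lacks one of the classes")
        direction = 1.0 if mean(pos) >= mean(neg) else -1.0
        for i in fold:
            scores[i] = direction * validator[i]
    return auc(scores, target)


# Synthetic illustration only: one baseline validator and four follow-up scores
# per person, with the follow-up scores loosely tied to the baseline value.
rng = random.Random(42)
n = 200
baseline = [rng.gauss(0, 1) for _ in range(n)]
trajectories = [[10 + 3 * b + rng.gauss(0, 2) for _ in range(4)] for b in baseline]
summary = [mean(t) for t in trajectories]  # assumed longitudinal summary

# Hypothetical candidate cutoffs defining alternative binary prediction targets.
candidate_cutoffs = [8, 10, 12]
results = {
    c: cv_auc(baseline, [int(s >= c) for s in summary])
    for c in candidate_cutoffs
}

for cutoff, value in results.items():
    print(f"cutoff {cutoff}: cross-validated AUC = {value:.3f}")
```

In the spirit of the abstract, a cross-validated comparison of this kind would be one imperfect input among several, to be weighed together with model fit and clinical judgment rather than used alone to choose a target.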

Keywords: Prognostic model; clinical threshold; cross-validation; latent trajectory class; semisupervised learning.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Bipolar Disorder / physiopathology
  • Machine Learning*
  • Models, Statistical
  • Outcome Assessment, Health Care* / statistics & numerical data
  • Precision Medicine
  • Prognosis*