A random forest algorithm-based approach to capture latent decision variables and their cutoff values

J Biomed Inform. 2020 Oct:110:103548. doi: 10.1016/j.jbi.2020.103548. Epub 2020 Aug 28.

Abstract

Although reference intervals (RIs) and clinical decision limits (CDLs) are vital laboratory information for interpreting numerical clinical pathology results, there is evidence that RIs and CDLs vary in certain contexts, as well as other evidence that they are flawed. We propose a random forest algorithm-based exploration methodology that uses phenotype transformation of independent variables in relation to dependent variables to capture latent decision variables and their cutoff values. We define latent CDLs as CDLs that lie within the RIs estimated by an indirect method and that affect diagnoses or outcomes in the context of specific patient conditions. We then apply the proposed methodology to clinical laboratory data on bodily fluids, such as blood and urine, collected at patient admission, to explore latent CDLs of hospital length of stay (HLOS) for each patient condition, identified by the diseases of patients undergoing surgery. From the exploration results, we found that free thyroxine (T4) above five unique cutoff values (1.16 ng/dL, 1.19 ng/dL, 1.2 ng/dL, 1.23 ng/dL, and 1.25 ng/dL) predicted longer HLOS for tachyarrhythmia, even though these cutoff values fall within both the estimated RIs and the hospital-determined RIs. In addition to the evidence that higher free T4 levels within the RIs are associated with the corresponding disease, the cutoff values other than 1.16 ng/dL tended, on the whole, to be associated with longer HLOS, with significant differences. Clinical experts could discuss whether it is meaningful to raise alerts at admission about the risk to patients' conditions and the risk of long HLOS. If they judge the alerts meaningful in clinical practice, the alerts could be embedded in electronic medical records to manage those risks at admission.
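The idea of reading candidate cutoff values off tree split points can be illustrated with a minimal sketch. It is not the authors' method: it swaps the random forest for a simplified stand-in (a bootstrap ensemble of depth-one regression stumps), omits the phenotype transformation step, and runs on synthetic data with a planted cutoff between 1.20 and 1.21 ng/dL. All function names, the synthetic HLOS values, and the reference interval of 0.9 to 1.7 ng/dL are assumptions made for illustration only.

```python
import random
from collections import Counter


def make_synthetic_data(rng):
    """Synthetic (free T4 in ng/dL, HLOS in days) pairs. NOT the study data."""
    data = []
    for i in range(41):
        t4 = round(1.00 + 0.01 * i, 2)      # grid from 1.00 to 1.40 ng/dL
        base = 5.0 if i <= 20 else 9.0      # planted cutoff between 1.20 and 1.21
        data.append((t4, base + rng.uniform(-0.25, 0.25)))
    return data


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_stump_threshold(sample):
    """Threshold of the single split that minimises the total squared error."""
    ordered = sorted(sample)
    best_thr, best_err = None, float("inf")
    for k in range(1, len(ordered)):
        lo, hi = ordered[k - 1][0], ordered[k][0]
        if lo == hi:                        # cannot split between equal values
            continue
        err = sse([y for _, y in ordered[:k]]) + sse([y for _, y in ordered[k:]])
        if err < best_err:
            best_thr, best_err = round((lo + hi) / 2, 3), err
    return best_thr


def collect_cutoffs(data, n_trees, rng):
    """Fit one stump per bootstrap sample and count the thresholds it picks."""
    counts = Counter()
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]   # sample with replacement
        thr = best_stump_threshold(sample)
        if thr is not None:
            counts[thr] += 1
    return counts


def cutoffs_within_interval(counts, low, high):
    """Keep only cutoffs inside a reference interval (candidate latent CDLs)."""
    return {thr: n for thr, n in counts.items() if low <= thr <= high}


if __name__ == "__main__":
    rng = random.Random(0)
    counts = collect_cutoffs(make_synthetic_data(rng), n_trees=200, rng=rng)
    # Hypothetical reference interval, assumed for illustration only.
    latent = cutoffs_within_interval(counts, low=0.9, high=1.7)
    for thr, n in sorted(latent.items(), key=lambda item: -item[1])[:5]:
        print(f"cutoff {thr} ng/dL chosen by {n} of 200 stumps")
```

Because each bootstrap sample omits some measurements, the stumps settle on several nearby thresholds rather than a single value, which is one plausible way a tree ensemble can yield a handful of closely spaced cutoffs inside a reference interval. Any cutoff surfaced this way would still need a separate significance test against the outcome before it is treated as meaningful.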

Keywords: Clinical laboratory data; Cutoff values; Knowledge discovery; Latent decision variables; Phenotype transformation; Random forests.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Clinical Laboratory Services*
  • Electronic Health Records*
  • Humans
  • Reference Values