Machine-learning enhancement of urine dipstick tests for chronic kidney disease detection

J Am Med Inform Assoc. 2023 May 19;30(6):1114-1124. doi: 10.1093/jamia/ocad051.

Abstract

Objective: Screening for chronic kidney disease (CKD) requires an estimated glomerular filtration rate (eGFR, mL/min/1.73 m²) from a blood sample and a proteinuria level from a urinalysis. We developed machine-learning models to detect CKD without blood collection by predicting an eGFR below 60 (eGFR60 model) or below 45 (eGFR45 model) from a urine dipstick test.
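The two prediction targets described above can be read as binary labels derived from a measured eGFR. The following is a minimal sketch of that reading, not the authors' code; the function name `egfr_labels` and the dictionary keys are hypothetical, and treating "less than" as a strict inequality is an assumption.

```python
def egfr_labels(egfr: float) -> dict:
    """Return binary targets for the two models.

    egfr is in mL/min/1.73 m^2 and comes from the blood test; it is used
    only to build training labels, not as a model input.
    Assumption: "less than 60/45" is a strict inequality.
    """
    return {
        "eGFR60": int(egfr < 60),  # 1 if eGFR is below 60
        "eGFR45": int(egfr < 45),  # 1 if eGFR is below 45
    }


if __name__ == "__main__":
    for value in (72.0, 52.5, 38.0):
        print(value, egfr_labels(value))
```

Under this labeling, every record that is positive for the eGFR45 target is also positive for the eGFR60 target.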

Materials and methods: Electronic health record data (n = 220 018) from university hospitals were used to construct XGBoost-based models. The model variables were age, sex, and 10 measurements from the urine dipstick test. The models were validated using health checkup center data (n = 74 380) and nationwide public data (KNHANES data, n = 62 945) for the general population in Korea.

Results: The models comprised 7 features: age, sex, and 5 urine dipstick measurements (protein, blood, glucose, pH, and specific gravity). The internal and external areas under the curve (AUCs) of the eGFR60 model were 0.90 or higher, and the eGFR45 model achieved higher AUCs. For the eGFR60 model on KNHANES data, among individuals younger than 65 with proteinuria, the sensitivity was 0.93 and the specificity was 0.86 in those without diabetes, and 0.80 and 0.85, respectively, in those with diabetes. Nonproteinuric CKD could be detected in nondiabetic patients under the age of 65 with a sensitivity of 0.88 and specificity of 0.71.
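For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and AUC are commonly computed from true labels and model scores. It uses invented toy data and a hypothetical 0.5 decision threshold; it does not reproduce the study's data, threshold, or evaluation code.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration): 1 = eGFR below the cutoff.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.3, 0.6, 0.2, 0.1, 0.4, 0.05]
threshold = 0.5  # hypothetical operating point
y_pred = [int(s >= threshold) for s in scores]

if __name__ == "__main__":
    sens, spec = sensitivity_specificity(y_true, y_pred)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} "
          f"AUC={auc(y_true, scores):.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports both.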

Discussion and conclusions: The model performance differed across subgroups by age, proteinuria, and diabetes. The CKD progression risk can be assessed with the eGFR models using the levels of eGFR decrease and proteinuria. The machine-learning-enhanced urine dipstick test can become a point-of-care test that promotes public health by screening for CKD and ranking the risk of its progression.

Keywords: XGBoost; chronic kidney disease; estimated glomerular filtration rate; machine-learning model; urinalysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Creatinine / urine
  • Diabetes Mellitus*
  • Glomerular Filtration Rate
  • Humans
  • Proteinuria / diagnosis
  • Proteinuria / epidemiology
  • Proteinuria / urine
  • Renal Insufficiency, Chronic* / diagnosis
  • Urinalysis

Substances

  • Creatinine