Learning meaningful latent space representations for patient risk stratification: Model development and validation for dengue and other acute febrile illness

Bernard Hernandez; Oliver Stiff; Damien K Ming; Chanh Ho Quang; Vuong Nguyen Lam; Tuan Nguyen Minh; Chau Nguyen Van Vinh; Nguyet Nguyen Minh; Huy Nguyen Quang; Lam Phung Khanh; Tam Dong Thi Hoai; Trung Dinh The; Trieu Huynh Trung; Bridget Wills; Cameron P Simmons; Alison H Holmes; Sophie Yacoub; Pantelis Georgiou; Vietnam ICU Translational Applications Laboratory (VITAL) investigators

doi:10.3389/fdgth.2023.1057467

Front Digit Health. 2023 Feb 22;5:1057467. doi: 10.3389/fdgth.2023.1057467. eCollection 2023.

Authors

Bernard Hernandez¹ ², Oliver Stiff¹, Damien K Ming² ³, Chanh Ho Quang⁴, Vuong Nguyen Lam⁴ ⁵, Tuan Nguyen Minh⁶, Chau Nguyen Van Vinh⁴ ⁷, Nguyet Nguyen Minh⁴, Huy Nguyen Quang⁴, Lam Phung Khanh⁴ ⁵, Tam Dong Thi Hoai⁴, Trung Dinh The⁴, Trieu Huynh Trung⁴ ⁷, Bridget Wills⁴ ⁸, Cameron P Simmons⁹, Alison H Holmes² ³, Sophie Yacoub⁴ ⁸, Pantelis Georgiou¹ ²; Vietnam ICU Translational Applications Laboratory (VITAL) investigators

Affiliations

¹ Centre for Bio-Inspired Technology, Imperial College London, London, United Kingdom.
² Centre for Antimicrobial Optimisation, Imperial College London, London, United Kingdom.
³ NIHR HPRU in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, United Kingdom.
⁴ Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam.
⁵ University of Medicine and Pharmacy, Ho Chi Minh City, Vietnam.
⁶ Children's Hospital No 1, Ho Chi Minh City, Vietnam.
⁷ Hospital for Tropical Diseases, Ho Chi Minh City, Vietnam.
⁸ Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom.
⁹ Institute of Vector Borne Disease, Monash University, Melbourne, VIC, Australia.

Abstract

Background: Increased data availability has prompted the creation of clinical decision support systems. These systems utilise clinical information to enhance health care provision, either to predict the likelihood of specific clinical outcomes or to evaluate the risk of further complications. However, their adoption remains low due to concerns regarding the quality of recommendations, and a lack of clarity on how results are best obtained and presented.

Methods: We used autoencoders, which can reduce the dimensionality of complex datasets, to produce a 2D representation (the latent space) that supports understanding of complex clinical data. In this output, meaningful representations of individual patient profiles are spatially mapped in an unsupervised manner according to their input clinical parameters. This technique was then applied to a large real-world clinical dataset of over 12,000 patients with an illness compatible with dengue infection in Ho Chi Minh City, Vietnam between 1999 and 2021. Dengue is a systemic viral disease which imposes a significant health and economic burden worldwide, and up to 5% of hospitalised patients develop life-threatening complications.
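To make the idea of a 2D latent space concrete, the following is a minimal sketch of an autoencoder written with the Python standard library only. It is not the authors' model or code: the class name TinyAutoencoder, the layer sizes, the learning rate, and the random toy data standing in for patient profiles are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class TinyAutoencoder:
    """Hypothetical toy model: n inputs -> 2 sigmoid latent units -> n linear outputs."""

    def __init__(self, n_features, n_latent=2, seed=0):
        rng = random.Random(seed)
        self.w_enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_features)]
                      for _ in range(n_latent)]
        self.b_enc = [0.0] * n_latent
        self.w_dec = [[rng.uniform(-0.5, 0.5) for _ in range(n_latent)]
                      for _ in range(n_features)]
        self.b_dec = [0.0] * n_features

    def encode(self, x):
        # Map one feature vector to its 2D latent coordinates.
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w_enc, self.b_enc)]

    def decode(self, z):
        # Reconstruct the feature vector from the latent coordinates.
        return [sum(w * zj for w, zj in zip(row, z)) + b
                for row, b in zip(self.w_dec, self.b_dec)]

    def loss(self, data):
        # Mean squared reconstruction error over the dataset.
        total = 0.0
        for x in data:
            x_hat = self.decode(self.encode(x))
            total += sum((a - b) ** 2 for a, b in zip(x_hat, x)) / len(x)
        return total / len(data)

    def train_step(self, x, lr):
        # One stochastic gradient descent step on a single sample.
        z = self.encode(x)
        x_hat = self.decode(z)
        d_out = [2.0 * (a - b) / len(x) for a, b in zip(x_hat, x)]
        # Backpropagate to the latent layer before the decoder weights change.
        d_z = [sum(d_out[i] * self.w_dec[i][j] for i in range(len(x)))
               for j in range(len(z))]
        d_pre = [dz * zj * (1.0 - zj) for dz, zj in zip(d_z, z)]
        for i in range(len(x)):
            for j in range(len(z)):
                self.w_dec[i][j] -= lr * d_out[i] * z[j]
            self.b_dec[i] -= lr * d_out[i]
        for j in range(len(z)):
            for k in range(len(x)):
                self.w_enc[j][k] -= lr * d_pre[j] * x[k]
            self.b_enc[j] -= lr * d_pre[j]


# Toy stand-in for scaled clinical parameters (random values, not real data).
rng = random.Random(42)
patients = [[rng.random() for _ in range(4)] for _ in range(30)]

model = TinyAutoencoder(n_features=4)
initial_loss = model.loss(patients)
for _ in range(200):
    for x in patients:
        model.train_step(x, lr=0.1)
final_loss = model.loss(patients)

# Each patient is now a point in a 2D latent space that can be plotted.
latent_points = [model.encode(x) for x in patients]
```

No labels are used during training: the model only learns to reconstruct its input through a two-unit bottleneck, which is what makes the mapping unsupervised.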

Results: The latent space produced by the selected autoencoder aligns with established clinical characteristics exhibited by patients with dengue infection, as well as features of disease progression. Similar clinical phenotypes are represented close to each other in the latent space and clustered according to outcomes broadly described by the World Health Organisation dengue guidelines. Balancing distance metrics and density metrics produced results covering most of the latent space, and improved visualisation whilst preserving utility, with similar patients grouped closer together. In this case, this balance is achieved by using the sigmoid activation function and one hidden layer with three neurons, in addition to the latent dimension layer, which produces the output (Pearson, 0.840; Spearman, 0.830; Procrustes, 0.301; GMM, 0.321).
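As an illustration of how distance preservation might be scored, the sketch below computes Pearson and Spearman correlations between pairwise distances in the original feature space and in a 2D embedding. The abstract does not spell out how its metrics were calculated, so this pairing of distances is an assumption, as are the function names and the toy coordinates. The Procrustes and GMM scores are not reproduced here.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    # Euclidean distance for every unordered pair of points.
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def pearson(u, v):
    # Pearson correlation coefficient between two equal-length sequences.
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)


def ranks(values):
    # Average ranks: tied values share the mean of their positions.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(u, v):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(ranks(u), ranks(v))


# Hypothetical toy data: four points in 3D and a 2D embedding of them.
original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (3.0, 3.0, 1.0)]
latent = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]

d_original = pairwise_distances(original)
d_latent = pairwise_distances(latent)

pearson_score = pearson(d_original, d_latent)
spearman_score = spearman(d_original, d_latent)
```

Scores near 1 mean that patients who are far apart in the input space remain far apart in the embedding. Spearman compares only the ordering of distances, so it is less sensitive than Pearson to non-linear stretching of the latent space.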

Conclusion: This study demonstrates that when adequately configured, autoencoders can produce two-dimensional representations of a complex dataset that conserve the distance relationship between points. The output visualisation groups patients with clinically relevant features closely together and inherently supports user interpretability. Work is underway to incorporate these findings into an electronic clinical decision support system to guide individual patient management.

Keywords: autoencoder (AE) neural networks; clinical decision support system (CDSS); dengue; similarity retrieval; unsupervised learning; visualisation.

© 2023 Hernandez, Stiff, Ming, Ho Quang, Nguyen Lam, Nguyen Minh, Nguyen Van Vinh, Nguyen Minh, Nguyen Quang, Phung Khanh, Dong Thi Hoai, Dinh The, Huynh Trung, Wills, Simmons, Holmes, Yacoub and Georgiou.

Grants and funding

This work was supported by the Wellcome Trust grant (215010/Z/18/Z); DH and BH receive their salaries from and are supported by the grant. The funding source had no role in the design, data collection, analysis or writing of the manuscript.