Heterogeneous data integration methods for patient similarity networks

Jessica Gliozzo; Marco Mesiti; Marco Notaro; Alessandro Petrini; Alex Patak; Antonio Puertas-Gallardo; Alberto Paccanaro; Giorgio Valentini; Elena Casiraghi

doi:10.1093/bib/bbac207

Brief Bioinform. 2022 Jul 18;23(4):bbac207. doi: 10.1093/bib/bbac207.

Authors

Jessica Gliozzo^{1,2,3}, Marco Mesiti^{1,3}, Marco Notaro^{1,3}, Alessandro Petrini^{1,3}, Alex Patak², Antonio Puertas-Gallardo², Alberto Paccanaro^{4,5}, Giorgio Valentini^{1,3,6,7}, Elena Casiraghi^{1,3}

Affiliations

¹ AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.
² European Commission, Joint Research Centre (JRC), Ispra (VA), Italy.
³ CINI, Infolife National Laboratory, Roma, Italy.
⁴ Department of Computer Science, Royal Holloway, University of London, Egham, TW20 0EX, UK.
⁵ School of Applied Mathematics (EMAp), Fundação Getúlio Vargas, Rio de Janeiro, Brazil.
⁶ DSRC UNIMI, Data Science Research Center, Milano, 20135, Italy.
⁷ ELLIS, European Laboratory for Learning and Intelligent Systems, Berlin, Germany.

Abstract

Patient similarity networks (PSNs), where patients are represented as nodes and their similarities as weighted edges, are being increasingly used in clinical research. These networks provide an insightful summary of the relationships among patients and can be exploited by inductive or transductive learning algorithms for the prediction of patient outcome, phenotype and disease risk. PSNs can also be easily visualized, thus offering a natural way to inspect complex heterogeneous patient data and providing some level of explainability of the predictions obtained by machine learning algorithms. The advent of high-throughput technologies, enabling us to acquire high-dimensional views of the same patients (e.g. omics data, laboratory data, imaging data), calls for the development of data fusion techniques for PSNs in order to leverage this rich heterogeneous information. In this article, we review existing methods for integrating multiple biomedical data views to construct PSNs, together with the different patient similarity measures that have been proposed. We also review methods that have appeared in the machine learning literature but have not yet been applied to PSNs, thus providing a resource to navigate the vast machine learning literature existing on this topic. In particular, we focus on methods that could be used to integrate very heterogeneous datasets, including multi-omics data as well as data derived from clinical information and medical imaging.

Keywords: biomedical applications; data fusion; multimodal data; patient similarity networks.
