Unsupervised Event Graph Representation and Similarity Learning on Biomedical Literature

Giacomo Frisoni; Gianluca Moro; Giulio Carlassare; Antonella Carbonaro

doi:10.3390/s22010003

Sensors (Basel). 2021 Dec 21;22(1):3. doi: 10.3390/s22010003.

Authors

Giacomo Frisoni¹, Gianluca Moro¹, Giulio Carlassare², Antonella Carbonaro¹

Affiliations

¹ Department of Computer Science and Engineering (DISI), University of Bologna, 40126 Bologna, Italy.
² Independent Researcher, 48018 Faenza, Italy.

Abstract

The automatic extraction of biomedical events from the scientific literature has drawn keen interest in recent years because it recovers complex, semantically rich graphical interactions otherwise buried in text. However, very few works address learning embeddings or similarity metrics for event graphs. This gap leaves biological relations unlinked and prevents the application of machine learning techniques to promote discoveries. Taking advantage of recent deep graph kernel solutions and pre-trained language models, we propose Deep Divergence Event Graph Kernels (DDEGK), an unsupervised inductive method that maps events into low-dimensional vectors while preserving their structural and semantic similarities. Unlike most other systems, DDEGK operates at the graph level and does not require task-specific labels, feature engineering, or known correspondences between nodes. To this end, our solution compares events against a small set of anchor events, trains cross-graph attention networks to draw pairwise alignments (bolstering interpretability), and employs transformer-based models to encode continuous attributes. We conduct extensive experiments on nine biomedical datasets. We show that the learned event representations can be effectively employed in tasks such as graph classification, clustering, and visualization, and that they also facilitate downstream semantic textual similarity. Empirical results demonstrate that DDEGK significantly outperforms other state-of-the-art methods.
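
The anchor-based idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: DDEGK trains cross-graph attention networks and uses graph structure, whereas the toy version below uses untrained dot-product attention over node attribute vectors only. The function names (`softmax`, `dot`, `divergence`, `embed`) and the representation of a graph as a plain list of node vectors are assumptions made for illustration; in practice the vectors would presumably come from a transformer-based encoder.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def divergence(target, anchor):
    # Hypothetical divergence: align each target node to the anchor nodes
    # with (untrained) attention, rebuild the target node from the anchor,
    # and return the mean squared reconstruction error.
    total = 0.0
    for t in target:
        weights = softmax([dot(t, a) for a in anchor])
        rebuilt = [sum(w * a[i] for w, a in zip(weights, anchor))
                   for i in range(len(t))]
        total += sum((x - y) ** 2 for x, y in zip(t, rebuilt))
    return total / len(target)

def embed(graph, anchors):
    # One coordinate per anchor: the divergence of the graph from that anchor.
    return [divergence(graph, a) for a in anchors]

# Toy data (assumed): each graph is a list of 2-d node attribute vectors.
anchors = [[[1.0, 0.0]], [[0.0, 1.0]]]
event = [[1.0, 0.0]]
print(embed(event, anchors))  # [0.0, 2.0]
```

The resulting vector has one dimension per anchor, so a small anchor set yields a low-dimensional embedding, and the attention weights indicate which anchor nodes each event node was aligned to.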

Keywords: biomedical text mining; event embedding; event extraction; graph kernels; graph neural networks; graph representation learning; graph similarity learning; metric learning.

MeSH terms

Cluster Analysis
Machine Learning*
Publications
Semantics*