Machine Learning to Support the Presentation of Complex Pathway Graphs

IEEE/ACM Trans Comput Biol Bioinform. 2021 May-Jun;18(3):1130-1141. doi: 10.1109/TCBB.2019.2938501. Epub 2021 Jun 3.

Abstract

Visualization of biological mechanisms by means of pathway graphs is necessary to better understand the often complex underlying system. Manual layout of such pathways or maps of knowledge is a difficult and time-consuming process. Node duplication is a technique that improves layout readability by reducing edge crossings and shortening edge lengths in drawn diagrams. In this article, we propose an approach using Machine Learning (ML) to facilitate parts of this task by training a Support Vector Machine (SVM) with actions taken during manual biocuration. Our training input is a series of incremental snapshots of a diagram describing mechanisms of a disease, progressively curated by a human expert who employs node duplication in the process. To test the trained SVM models, we apply them to a single large instance and 25 medium-sized instances of hand-curated biological pathways. Finally, in a user validation study, we compare the model predictions to the outcome of a node duplication questionnaire answered by users of biological pathways with varying experience. We successfully predicted nodes for duplication and emulated human choices, demonstrating that our approach can effectively learn human-like node duplication preferences to support curation of pathway diagrams in various contexts.
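
To make the node duplication operation mentioned in the abstract concrete, the following is a minimal Python sketch under stated assumptions. It is not the authors' SVM pipeline or their feature set, which the abstract does not specify. The function names `duplicate_node` and `degree`, the `#k` naming of copies, and the example node labels are all hypothetical. The sketch replaces one highly connected node with one copy per incident edge, which is the simplest form of duplication; node degree is shown only as an example of a property that a classifier might plausibly use as input.

```python
def duplicate_node(edges, node):
    """Return a new edge list in which `node` is replaced by one copy
    per incident edge (hypothetical naming scheme: "node#1", "node#2", ...)."""
    new_edges = []
    copies = 0
    for u, v in edges:
        if node in (u, v):
            copies += 1
            alias = f"{node}#{copies}"
            # Rewire this edge to its own private copy of the node.
            u = alias if u == node else u
            v = alias if v == node else v
        new_edges.append((u, v))
    return new_edges


def degree(edges):
    """Count incident edges per node in an undirected edge list."""
    counts = {}
    for u, v in edges:
        counts[u] = counts.get(u, 0) + 1
        counts[v] = counts.get(v, 0) + 1
    return counts


# Toy pathway (assumed labels): a shared metabolite "ATP" takes part
# in three reactions, and two of the reactions are also linked.
edges = [("ATP", "R1"), ("ATP", "R2"), ("ATP", "R3"), ("R1", "R2")]

before = degree(edges)
after_edges = duplicate_node(edges, "ATP")
after = degree(after_edges)

print(before["ATP"])   # degree of the hub before duplication
print(after_edges)     # each reaction now has its own ATP copy
```

After duplication, each copy has a single incident edge and can be placed next to its reaction, which is how duplication can shorten edges and remove crossings in a drawn layout. Deciding which nodes to duplicate is the part the article addresses with a trained SVM.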

MeSH terms

  • Computational Biology / methods*
  • Data Display
  • Humans
  • Machine Learning*
  • Models, Biological*
  • Signal Transduction
  • Support Vector Machine