Computational Literature-based Discovery for Natural Products Research: Current State and Future Prospects

Andreas Lardos; Ahmad Aghaebrahimian; Anna Koroleva; Julia Sidorova; Evelyn Wolfram; Maria Anisimova; Manuel Gil

doi:10.3389/fbinf.2022.827207

Front Bioinform. 2022 Mar 15:2:827207. doi: 10.3389/fbinf.2022.827207. eCollection 2022.

Authors

Andreas Lardos¹, Ahmad Aghaebrahimian²,³, Anna Koroleva²,³, Julia Sidorova⁴, Evelyn Wolfram¹, Maria Anisimova²,³, Manuel Gil²,³

Affiliations

¹ Natural Product Chemistry and Phytopharmacy Research Group, Institute of Chemistry and Biotechnology, School of Life Sciences and Facility Management, Zurich University of Applied Sciences (ZHAW), Waedenswil, Switzerland.
² Institute of Applied Simulation, School of Life Sciences and Facility Management, Zurich University of Applied Sciences (ZHAW), Waedenswil, Switzerland.
³ Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
⁴ Instituto de Tecnología del Conocimiento, Universidad Complutense de Madrid, Madrid, Spain.

Abstract

Literature-based discovery (LBD) mines existing literature to generate new hypotheses by finding links between previously disconnected pieces of knowledge. Although automated LBD systems are becoming widespread and indispensable in a wide variety of knowledge domains, little has been done to introduce LBD to the field of natural products research. Despite growing knowledge in the natural product domain, most of the accumulated information is found in detached data pools. LBD can facilitate better contextualization and exploitation of this wealth of data, for example by formulating new hypotheses for natural product research, especially in the context of drug discovery and development. Moreover, automated LBD systems promise to accelerate the currently tedious and expensive process of lead identification, optimization, and development. Focusing on natural product research, we briefly reflect on the development of automated LBD and summarize its methods and principal data sources. In a thorough review of published use cases of LBD in the biomedical domain, we highlight the immense potential of this data mining approach for natural product research, especially in the context of drug discovery or repurposing, mode of action, and drug or substance interactions. Most of the 91 natural product-related discoveries in our sample of reported use cases of LBD were addressed to a computer science audience. Therefore, it is the wider goal of this review to introduce automated LBD to researchers who work with natural products and to facilitate the dialogue between this community and the developers of automated LBD systems.

Keywords: knowledge graph; literature-based discovery; natural language processing; natural products; ontology; semantic integration; Swanson; text mining.

Publication types

Review