Padhoc: a computational pipeline for pathway reconstruction on the fly

Bioinformatics. 2020 Dec 30;36(Suppl_2):i795-i803. doi: 10.1093/bioinformatics/btaa811.

Abstract

Motivation: Molecular pathway databases represent cellular processes in a structured and standardized way. These databases support the community-wide utilization of pathway information in biological research and the computational analysis of high-throughput biochemical data. Although pathway databases are critical in genomics research, the fast progress of biomedical sciences prevents databases from staying up-to-date. Moreover, the compartmentalization of cellular reactions into defined pathways reflects arbitrary choices that might not always be aligned with the needs of the researcher. Today, no tool exists that allow the easy creation of user-defined pathway representations.

Results: Here we present Padhoc, a pipeline for pathway ad hoc reconstruction. Based on a set of user-provided keywords, Padhoc combines natural language processing, database knowledge extraction, orthology search and powerful graph algorithms to create navigable pathways tailored to the user's needs. We validate Padhoc with a set of well-established Escherichia coli pathways and demonstrate usability to create not-yet-available pathways in model (human) and non-model (sweet orange) organisms.

Availability and implementation: Padhoc is freely available at https://github.com/ConesaLab/padhoc.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology*
  • Databases, Factual
  • Genomics
  • Humans
  • Software*