Domain-adversarial multi-task framework for novel therapeutic property prediction of compounds

Bioinformatics. 2020 May 1;36(9):2848-2855. doi: 10.1093/bioinformatics/btaa063.

Abstract

Motivation: With the rapid development of high-throughput technologies, the parallel acquisition of large-scale drug-informatics data provides significant opportunities to improve pharmaceutical research and development. One important application is predicting the therapeutic purpose of small-molecule compounds, with the aim of identifying the therapeutic properties of the many compounds whose purpose is unknown and of finding novel therapeutic properties for FDA-approved drugs (drug repurposing). This problem is extremely challenging because compound attributes comprise heterogeneous data with differing feature patterns, such as drug fingerprints, drug physicochemical properties and drug perturbation gene expression profiles. Moreover, the dependencies among these heterogeneous data are complex and non-linear. In this study, we propose a novel domain-adversarial multi-task framework for integrating shared knowledge from multiple domains. The framework first uses an adversarial strategy to learn target representations and then models the non-linear dependency among the domains.
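
To make the adversarial step concrete, the following is a minimal sketch of one common domain-adversarial device: a gradient reversal step placed between a shared encoder and a domain classifier. It is not taken from the DAMT source code. The linear encoder, the softmax domain classifier, the toy inputs and all names (`domain_adversarial_grads`, `w_enc`, `w_dom`, `lam`) are assumptions chosen for illustration, and the inputs are assumed to have already been projected to a common dimension.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(u * v for u, v in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def softmax_rows(z):
    """Row-wise softmax with max subtraction for numerical stability."""
    out = []
    for row in z:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out

def domain_adversarial_grads(x, domains, w_enc, w_dom, lam=1.0):
    """Hypothetical helper: domain loss plus gradients for a linear shared
    encoder (w_enc) and a linear softmax domain classifier (w_dom)."""
    n = len(x)
    h = matmul(x, w_enc)                # shared representation
    p = softmax_rows(matmul(h, w_dom))  # reversal is the identity on the forward pass
    loss = -sum(math.log(p[i][domains[i]]) for i in range(n)) / n

    # Gradient of the mean cross-entropy with respect to the logits.
    dlogits = [[(p[i][k] - (1.0 if k == domains[i] else 0.0)) / n
                for k in range(len(p[i]))] for i in range(n)]

    # The domain classifier receives the ordinary gradient.
    grad_w_dom = matmul(transpose(h), dlogits)

    # Gradient reversal: the encoder receives the gradient scaled by -lam.
    dh = matmul(dlogits, transpose(w_dom))
    dh_reversed = [[-lam * v for v in row] for row in dh]
    grad_w_enc = matmul(transpose(x), dh_reversed)
    return loss, grad_w_dom, grad_w_enc

# Toy data (assumed): three samples, one from each of three domains.
x = [[1.0, 0.5], [-0.3, 2.0], [0.7, -1.2]]
domains = [0, 1, 2]
w_enc = [[0.2, -0.4], [0.1, 0.3]]
w_dom = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.6]]

loss, g_dom, g_enc = domain_adversarial_grads(x, domains, w_enc, w_dom, lam=1.0)
```

In a training loop, the domain classifier would descend along `g_dom` while the encoder descends along `g_enc`. Because of the sign flip, the encoder is pushed toward representations that the classifier cannot use to tell the domains apart, which is one way shared knowledge across domains can be encouraged.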

Results: Experiments on two real-world datasets show that our approach clearly outperforms competitive baselines. The novel therapeutic properties that we predicted for purpose-unknown compounds have been widely reported or brought into clinical use. Furthermore, our framework can integrate attributes beyond the three domains examined here and can be applied in industry to screen large numbers of small-molecule drug candidates.

Availability and implementation: The source code and datasets are available at https://github.com/JohnnyY8/DAMT-Model.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Drug Repositioning*
  • High-Throughput Screening Assays*
  • Software*