Enhancing reinforcement learning for de novo molecular design applying self-attention mechanisms

Tiago O Pereira; Maryam Abbasi; Joel P Arrais

doi:10.1093/bib/bbad368

Brief Bioinform. 2023 Sep 22;24(6):bbad368. doi: 10.1093/bib/bbad368.

Authors

Tiago O Pereira¹, Maryam Abbasi¹, Joel P Arrais¹

Affiliation

¹ Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Univ Coimbra, Coimbra, Portugal.

PMID: 37903414
DOI: 10.1093/bib/bbad368

Abstract

The drug discovery process can be significantly improved by applying deep reinforcement learning (RL) methods that learn to generate compounds with desired pharmacological properties. Nevertheless, RL-based methods typically condense the evaluation of sampled compounds into a single scalar value, making it difficult for the generative agent to learn the optimal policy. This work combines self-attention mechanisms and RL to generate promising molecules. The idea is to evaluate the relative significance of each atom and functional group in its interaction with the target, and to use this information to optimize the Generator. The framework for de novo drug design therefore comprises a Generator that samples new compounds, together with a Transformer encoder and a biological affinity Predictor that evaluate the generated structures. Moreover, it takes advantage of the knowledge encapsulated in the Transformer's attention weights to evaluate each token individually. We compared the performance of two output prediction strategies for the Transformer: standard and masked language model (MLM). The results show that the MLM Transformer is more effective at optimizing the Generator than state-of-the-art approaches. Additionally, the evaluation models identified the regions of each molecule that are most important for the biological interaction with the target. As a case study, we generated synthesizable hit compounds that are putative inhibitors of the enzyme ubiquitin-specific protein 7 (USP7).
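
The idea of replacing a single scalar reward with a per-token evaluation can be illustrated with a minimal sketch. The abstract does not give the exact formula used, so everything below is an assumption made for illustration: the function names (`per_token_rewards`, `policy_gradient_loss`), the softmax normalization of attention scores, the proportional split of the molecule-level reward, and the example scores are all hypothetical and are not the authors' implementation.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def per_token_rewards(tokens, attention_scores, molecule_reward):
    """Split one molecule-level reward across SMILES tokens.

    Assumption: each token receives a share of the scalar reward
    proportional to its softmax-normalized attention score.
    """
    if len(tokens) != len(attention_scores):
        raise ValueError("need exactly one attention score per token")
    weights = softmax(attention_scores)
    return [molecule_reward * w for w in weights]


def policy_gradient_loss(log_probs, token_rewards):
    """REINFORCE-style loss: negative sum of reward-weighted log-probabilities."""
    return -sum(lp * r for lp, r in zip(log_probs, token_rewards))


if __name__ == "__main__":
    # Hypothetical tokenized SMILES fragment with made-up attention scores.
    tokens = ["C", "C", "(", "=", "O", ")", "N"]
    scores = [0.1, 0.2, 0.0, 0.5, 2.0, 0.0, 1.0]
    rewards = per_token_rewards(tokens, scores, molecule_reward=0.8)
    for tok, r in zip(tokens, rewards):
        print(f"{tok}\t{r:.4f}")
```

In this sketch the per-token rewards always sum to the original molecule-level reward, so the total learning signal is unchanged and only its distribution over tokens differs. Tokens with higher attention contribute more to the Generator's update.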

Keywords: deep learning; drug design; reinforcement learning; SMILES; transformer.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Drug Design*
Drug Discovery
Learning*