Describe Molecules by a Heterogeneous Graph Neural Network with Transformer-like Attention for Supervised Property Predictions

Daiguo Deng; Zengrong Lei; Xiaobin Hong; Ruochi Zhang; Fengfeng Zhou

doi:10.1021/acsomega.1c06389

ACS Omega. 2022 Jan 21;7(4):3713-3721. doi: 10.1021/acsomega.1c06389. eCollection 2022 Feb 1.

Authors

Daiguo Deng¹, Zengrong Lei¹, Xiaobin Hong¹, Ruochi Zhang¹,², Fengfeng Zhou³

Affiliations

¹ Fermion Technology Co., Limited, Guangzhou, Guangdong 510000, P. R. China.
² School of Artificial Intelligence, Jilin University, Changchun 130012, P. R. China.
³ College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, P. R. China.

Abstract

Machine learning and deep learning have enabled many successful studies of molecular property prediction. The rapid development of natural language processing and graph neural networks (GNNs) has further pushed the state-of-the-art performance of molecular property prediction to a new level. A molecular structure can be described by a geometric graph with atoms as the nodes and bonds as the edges, so a GNN may be trained to better represent a molecular structure. Existing GNNs assume homogeneous types of atoms and bonds, which may miss important information carried by the different types of atoms or bonds. This study represents a molecule using a heterogeneous graph neural network (MolHGT), in which there are different types of nodes and different types of edges. A transformer-based readout function over virtual nodes is proposed to aggregate all the nodes, and the molecular graph may then be represented by the hidden states of the virtual nodes. This proof-of-principle study demonstrates that the proposed MolHGT network improves on existing approaches to molecular property prediction. The source code and the training/validation/test splitting details are available at https://github.com/zhangruochi/Mol-HGT.
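
The two ideas in the abstract, typed nodes and edges plus an attention-based readout from a virtual node, can be illustrated with a toy sketch. It is not the MolHGT implementation. The abstract does not specify the typing scheme or the readout details, so everything here is an assumption for illustration: element symbols serve as node types, (source element, bond order, target element) triples serve as edge types, and a single virtual-node query uses scaled dot-product attention. The function names `build_hetero_graph` and `attention_readout`, the hand-written ethanol input, and the two-dimensional node states are all hypothetical.

```python
import math
from collections import defaultdict

# Toy input (assumption): ethanol heavy atoms, CH3-CH2-OH, written by hand.
atoms = ["C", "C", "O"]
bonds = [(0, 1, "single"), (1, 2, "single")]


def build_hetero_graph(atoms, bonds):
    """Group node indices by atom type and edges by (src type, bond, dst type)."""
    nodes_by_type = defaultdict(list)
    for idx, element in enumerate(atoms):
        nodes_by_type[element].append(idx)

    edges_by_type = defaultdict(list)
    for src, dst, bond in bonds:
        # Store both directions so messages could flow either way.
        edges_by_type[(atoms[src], bond, atoms[dst])].append((src, dst))
        edges_by_type[(atoms[dst], bond, atoms[src])].append((dst, src))
    return dict(nodes_by_type), dict(edges_by_type)


def attention_readout(query, node_states):
    """Pool node states with scaled dot-product attention from one virtual-node query."""
    scale = math.sqrt(len(query))
    scores = [
        sum(q * h for q, h in zip(query, state)) / scale for state in node_states
    ]
    # Numerically stable softmax over the attention scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of node states gives the graph-level vector.
    pooled = [
        sum(w * state[d] for w, state in zip(weights, node_states))
        for d in range(len(query))
    ]
    return weights, pooled


nodes_by_type, edges_by_type = build_hetero_graph(atoms, bonds)

# Hypothetical 2-d hidden states: carbons share one state, oxygen has another.
node_states = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# A zero query scores every node equally, so the readout reduces to mean pooling.
weights, pooled = attention_readout([0.0, 0.0], node_states)
```

In a homogeneous graph the C-C and C-O bonds above would share one edge set. Here they fall under separate keys, so a model could learn separate parameters per type. With a learned, non-zero query the readout would weight atoms unequally instead of averaging them.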