An attention mechanism-based LSTM network for cancer kinase activity prediction

SAR QSAR Environ Res. 2022 Aug;33(8):631-647. doi: 10.1080/1062936X.2022.2109062.

Abstract

Despite the progress made in treating cancers over the past decades, resistance to available kinase drugs remains a major problem in cancer therapy. It is therefore highly desirable to develop computational models that can predict the bioactivity of a compound against cancer kinases. Here, we present a Long Short-Term Memory (LSTM) framework for predicting the activities of lead molecules against seven different kinases. A total of 14,907 compounds from the ChEMBL database were selected for model building. Two molecular representations, 2D descriptors and MACCS fingerprints, were used as inputs to train the LSTM models. We also integrated an attention mechanism into our model, which helped us to interpret the contribution of chemical features to kinase activity. By taking the significant chemical moieties into consideration during activity prediction, the attention mechanism extracted them more effectively. The test-set accuracies of the 2D descriptor-based and MACCS fingerprint-based models were 0.81 and 0.78, respectively. The area under the receiver operating characteristic curve (ROC-AUC) for both models was in the range of 0.8-0.99. The proposed framework can be a good starting point for the development of new cancer kinase drugs.
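To illustrate how an attention layer can make the contribution of individual input features inspectable, the following minimal sketch applies dot-product attention pooling to a sequence of hidden-state vectors such as an LSTM might emit. It is not the authors' implementation: the function names, the toy hidden states, and the `query` vector (a stand-in for learned attention parameters) are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Dot-product attention over a sequence of hidden states.

    hidden_states: one vector per time step (hypothetical LSTM outputs).
    query: vector standing in for learned attention parameters (assumption).
    Returns the attention weights and the weighted-sum context vector.
    """
    # One relevance score per time step: dot product with the query.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    # Context vector: attention-weighted sum of the hidden states.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Toy example: three time steps, two hidden units (made-up values).
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.3, -0.2]]
query = [1.0, 1.0]

weights, context = attention_pool(hidden_states, query)

# The step with the largest weight is the one the model "attends" to most.
top_step = max(range(len(weights)), key=weights.__getitem__)
print("attention weights:", weights)
print("most attended step:", top_step)
```

In a setting like the one described above, each time step could correspond to a fingerprint bit or descriptor, so the weights give a rough ranking of which inputs influenced the prediction. How the weights map back to chemical moieties depends on the input encoding, which the abstract does not specify.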

Keywords: Deep learning; LSTM; attention mechanism; cancer kinases.

MeSH terms

  • Humans
  • Neoplasms* / drug therapy
  • Quantitative Structure-Activity Relationship*
  • ROC Curve