S2DV: converting SMILES to a drug vector for predicting the activity of anti-HBV small molecules

Brief Bioinform. 2022 Mar 10;23(2):bbab593. doi: 10.1093/bib/bbab593.

Abstract

In the past few decades, chronic hepatitis B, caused by hepatitis B virus (HBV), has been one of the most serious threats to human health. The development of innovative systems is essential for preventing the complex pathogenesis of hepatitis B and reducing side effects caused by drugs. Various compounds have been developed into HBV inhibitory drugs, but their discovery is often limited by routine experimental screening, which delays drug development. More recently, with growing computational capability, virtual screening of compounds has gradually been adopted in drug research and applied to anti-HBV drug screening, facilitating a more reliable screening process. However, the lack of structural information in traditional compound analysis is an important cause of the unsatisfactory efficiency of drug screening. Here, a natural language processing technique was adopted to analyze compound simplified molecular-input line-entry system (SMILES) strings. By pretraining a word2vec model optimized for this task, we can accurately represent the relationship between a compound and its substructures. A machine learning model built on the pretrained representations can effectively predict the inhibitory effect of compounds on HBV and their liver toxicity. The reliability of the model was verified by wet-lab experiments. In addition, a tool has been published to predict potential compounds. Hence, this article provides a new perspective on predicting compound properties for anti-HBV drugs, which can help improve hepatitis B diagnosis and further advance human health in the future.
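To make the idea of treating a SMILES string as a sentence more concrete, the following minimal sketch splits a SMILES string into tokens and averages per-token embeddings into a single fixed-length drug vector. It is an illustration only and not the authors' implementation: the abstract does not specify the tokenization or pooling scheme, the names `tokenize` and `drug_vector` are hypothetical, the regular expression covers only common SMILES symbols, and the small embedding table stands in for vectors that would come from a trained word2vec model.

```python
import re

# Assumption: a simplified token pattern for common SMILES symbols
# (bracket atoms, Br/Cl, two-digit ring closures, organic-subset atoms,
# aromatic atoms, bonds/branches/other symbols, single-digit ring closures).
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOPSFI]|[bcnops]|[=#\-+\\/().:~@*$]|\d"
)


def tokenize(smiles):
    """Split a SMILES string into tokens ("words").

    Raises ValueError if any character is not covered by the pattern.
    """
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


def drug_vector(smiles, embeddings, dim):
    """Average the embeddings of known tokens into one vector.

    `embeddings` maps token -> list of floats of length `dim`.
    Tokens missing from the table are skipped; if no token is known,
    a zero vector is returned.
    """
    total = [0.0] * dim
    count = 0
    for token in tokenize(smiles):
        vec = embeddings.get(token)
        if vec is None:
            continue
        total = [t + v for t, v in zip(total, vec)]
        count += 1
    if count == 0:
        return total
    return [t / count for t in total]


# Toy embedding table (hypothetical values, not trained vectors).
toy_embeddings = {"C": [1.0, 0.0], "O": [0.0, 1.0]}

print(tokenize("CC(=O)Cl"))
print(drug_vector("CO", toy_embeddings, dim=2))
```

Mean pooling is used here only because it is the simplest way to obtain a fixed-length vector from a variable-length token sequence. The resulting vector could then be passed to any standard classifier or regressor for activity or toxicity prediction.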

Keywords: SMILES; drug discovery; hepatitis B virus; natural language processing; virtual screening; word embedding.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Antiviral Agents / pharmacology
  • Antiviral Agents / therapeutic use
  • Drug Discovery / methods
  • Hepatitis B virus*
  • Hepatitis B* / drug therapy
  • Humans
  • Reproducibility of Results

Substances

  • Antiviral Agents