A Fragment Library of Natural Products and its Comparative Chemoinformatic Characterization

Mol Inform. 2020 Nov;39(11):e2000050. doi: 10.1002/minf.202000050. Epub 2020 Apr 29.

Abstract

We report a comprehensive library of 205,903 fragments derived from the recently published Collection of Open Natural Products (COCONUT) data set, which contains more than 400,000 non-redundant natural products. The natural product-based fragment library was compared with two other fragment libraries generated herein from ChEMBL (biologically relevant compounds) and Enamine-REAL (a large on-demand collection of synthetic compounds), both used as reference data sets with relevance in drug discovery. We found a large diversity of unique fragments derived from natural products, and both the entire structures and the fragments derived from natural products are more diverse and structurally complex than those of the two reference compound collections. In this work, we also introduce a novel visual representation of chemical space based on the recently published concept of the statistical-based database fingerprint. The compound and fragment libraries from natural products generated and analyzed in this work are freely available.
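
As a rough illustration of what "unique fragments" means when one library is compared against reference libraries, the following minimal sketch uses plain Python sets. It is not the authors' pipeline: the function name `unique_fragments` and the three toy libraries are hypothetical, and the fragment strings are assumed to be already-canonicalized SMILES so that identical fragments compare equal as text.

```python
def unique_fragments(target, *references):
    """Return fragments found in `target` but in none of the reference libraries.

    Fragments are treated as opaque strings; this assumes they were
    canonicalized beforehand so that equal structures give equal strings.
    """
    seen_elsewhere = set().union(*references)
    return set(target) - seen_elsewhere


# Toy, made-up fragment sets (NOT taken from COCONUT, ChEMBL, or Enamine-REAL).
natural_products = {"c1ccccc1", "C1CCOC1", "O=C1C=CC(=O)C=C1"}
chembl = {"c1ccccc1", "c1ccncc1"}
enamine_real = {"c1ccccc1", "C1CCNCC1", "C1CCOC1"}

unique_np = unique_fragments(natural_products, chembl, enamine_real)
fraction_unique = len(unique_np) / len(natural_products)

print(sorted(unique_np))
print(f"{fraction_unique:.2f} of the toy natural-product fragments are unique")
```

With real data, the same set difference would be applied to the full fragment lists of each library; only the input loading changes.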

Keywords: ChEMBL; drug discovery; fingerprint; fragment; natural product.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biological Products / analysis*
  • Cheminformatics*
  • Databases as Topic
  • Small Molecule Libraries / analysis*

Substances

  • Biological Products
  • Small Molecule Libraries

Associated data

  • figshare/10.6084/m9.figshare.11997951