CFam: a chemical families database based on iterative selection of functional seeds and seed-directed compound clustering

Nucleic Acids Res. 2015 Jan;43(Database issue):D558-65. doi: 10.1093/nar/gku1212. Epub 2014 Nov 20.

Abstract

Similarity-based clustering and classification of compounds enable the search for drug leads and support structural and chemogenomic studies that facilitate chemical, biomedical, agricultural, material and other industrial applications. A database that organizes compounds into similarity-based as well as scaffold-based and property-based families is useful for these tasks. The CFam chemical family database (http://bidd2.cse.nus.edu.sg/cfam) was developed to hierarchically cluster drugs, bioactive molecules, human metabolites, natural products, patented agents and other molecules into functional families, superfamilies and classes of structurally similar compounds, based on literature-reported measures of high, intermediate and remote similarity. Compounds were represented by molecular fingerprints, and molecular similarity was measured by the Tanimoto coefficient. The functional seeds of CFam families were drawn from hierarchically clustered drugs, bioactive molecules, human metabolites, natural products and patented agents, respectively; these seeds were used to characterize families and to cluster compounds into families, superfamilies and classes. CFam currently contains 11,643 classes, 34,880 superfamilies and 87,136 families of 490,279 compounds (1691 approved drugs, 1228 clinical trial drugs, 12,386 investigative drugs, 262,881 highly active molecules, 15,055 human metabolites, 80,255 ZINC-processed natural products and 116,783 patented agents). Efforts will be made to further expand the CFam database and to add more functional categories and families based on other types of molecular representations.
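The fingerprint comparison and seed-directed assignment described in the abstract can be illustrated with a minimal Python sketch. This is not the CFam implementation: fingerprints are modelled here as plain sets of on-bit indices, the function names `tanimoto` and `assign_to_seed` are hypothetical, and the 0.7 cutoff used in the usage lines is an illustrative placeholder, since the abstract does not state the numeric thresholds for high, intermediate or remote similarity.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints.

    Each fingerprint is given as an iterable of on-bit indices
    (an assumed representation, chosen for illustration).
    """
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Two empty fingerprints: treat similarity as 0 by convention here.
        return 0.0
    return len(a & b) / union


def assign_to_seed(compound_fp, seeds, threshold):
    """Assign a compound to its most similar seed, if similar enough.

    `seeds` maps a (hypothetical) seed name to its fingerprint.
    Returns (seed_name, similarity); seed_name is None when the best
    similarity falls below `threshold`.
    """
    best_name, best_sim = None, 0.0
    for name, seed_fp in seeds.items():
        sim = tanimoto(compound_fp, seed_fp)
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_sim >= threshold:
        return best_name, best_sim
    return None, best_sim


# Usage with made-up fingerprints and an illustrative cutoff of 0.7.
seeds = {"seed_a": {1, 2, 3, 4, 5}, "seed_b": {7, 8}}
family, similarity = assign_to_seed({1, 2, 3, 4}, seeds, threshold=0.7)
```

Applying the same assignment with progressively lower cutoffs is one plausible way to obtain the nested family, superfamily and class levels, although the exact procedure used by CFam is described in the full paper rather than in this abstract.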

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biological Products / classification
  • Cluster Analysis
  • Databases, Chemical*
  • Drug Discovery*
  • Humans
  • Internet
  • Pharmaceutical Preparations / classification

Substances

  • Biological Products
  • Pharmaceutical Preparations