Virtual screening using binary kernel discrimination: effect of noisy training data and the optimization of performance

Beining Chen; Robert F Harrison; Kitsuchart Pasupa; Peter Willett; David J Wilton; David J Wood; Xiao Qing Lewell

doi:10.1021/ci0505426

J Chem Inf Model. 2006 Mar-Apr;46(2):478-86. doi: 10.1021/ci0505426.

Authors

Beining Chen¹, Robert F Harrison, Kitsuchart Pasupa, Peter Willett, David J Wilton, David J Wood, Xiao Qing Lewell

Affiliation

¹ Department of Chemistry, University of Sheffield, Sheffield S10 2TN, UK.

PMID: 16562975
DOI: 10.1021/ci0505426

Abstract

Binary kernel discrimination (BKD) uses a training set of compounds for which structural and qualitative activity data are available to build a model, which can then be applied to the structures of other compounds to predict their likely activity. Experiments with the MDL Drug Data Report database show that the optimal value of the smoothing parameter, and hence the predictive power of BKD, depends critically on the number of false positives in the training set. The best results for BKD are achieved when the smoothing parameter that lies at the heart of the method is determined by one particular optimization method, and when the Jaccard/Tanimoto coefficient is used in the kernel function that computes the similarity between a test set molecule and the members of the training set.
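
The abstract gives no formulas, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes fingerprints are stored as sets of on-bit indices, that the kernel takes the form (λ^(1-d) · (1-λ)^d)^k with d = 1 - Tanimoto similarity, and that a test molecule is scored by the ratio of its summed kernel values over training actives to that over training inactives. The names `tanimoto`, `kernel`, `bkd_score`, `loo_pair_accuracy`, and `choose_lambda` are hypothetical, as is the leave-one-out grid search, which stands in for the unnamed optimization method mentioned above.

```python
def tanimoto(a, b):
    """Jaccard/Tanimoto similarity between two sets of on-bit indices."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def kernel(a, b, lam, k=1.0):
    """Assumed kernel form: (lam^(1-d) * (1-lam)^d)^k, d = 1 - Tanimoto.

    lam is the smoothing parameter, expected to lie strictly in (0, 1).
    """
    d = 1.0 - tanimoto(a, b)
    return (lam ** (1.0 - d) * (1.0 - lam) ** d) ** k


def bkd_score(query, actives, inactives, lam, k=1.0):
    """Ratio of summed kernel values over actives to that over inactives."""
    if not inactives:
        raise ValueError("at least one inactive training compound is required")
    numerator = sum(kernel(query, a, lam, k) for a in actives)
    denominator = sum(kernel(query, i, lam, k) for i in inactives)
    return numerator / denominator


def loo_pair_accuracy(actives, inactives, lam, k=1.0):
    """Leave-one-out fraction of (active, inactive) pairs ranked correctly."""
    act = [bkd_score(a, actives[:n] + actives[n + 1:], inactives, lam, k)
           for n, a in enumerate(actives)]
    ina = [bkd_score(x, actives, inactives[:n] + inactives[n + 1:], lam, k)
           for n, x in enumerate(inactives)]
    # Count a tie as half a correctly ordered pair.
    wins = sum((sa > si) + 0.5 * (sa == si) for sa in act for si in ina)
    return wins / (len(act) * len(ina))


def choose_lambda(actives, inactives, grid, k=1.0):
    """Pick the grid value of lam with the best leave-one-out accuracy."""
    return max(grid, key=lambda lam: loo_pair_accuracy(actives, inactives, lam, k))


# Toy training set (made-up fingerprints, for illustration only).
actives = [{1, 2, 3}, {1, 2, 4}]
inactives = [{7, 8, 9}, {8, 9, 10}]
grid = [0.6, 0.7, 0.8, 0.9]

best_lam = choose_lambda(actives, inactives, grid)
score_active_like = bkd_score({1, 2, 5}, actives, inactives, lam=0.9)
score_inactive_like = bkd_score({8, 9, 11}, actives, inactives, lam=0.9)
```

Under these assumptions a score above 1 means the query resembles the training actives more than the inactives, so ranking a database by this score gives a virtual screening order. Mislabeled training compounds (for example, false positives among the actives) feed directly into the numerator, which is one way to see why the best smoothing parameter could shift with the noise level.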

Publication types

Comparative Study
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Artificial Intelligence
Data Interpretation, Statistical
Databases as Topic
Drug Design*
Drug Evaluation, Preclinical / methods
Models, Chemical*
Structure-Activity Relationship