Aggregated Conformal Prediction is used as an effective alternative to more complicated or ambiguous methods that rely on various balancing measures when modelling severely imbalanced datasets. Explicit balancing measures beyond those already a part of the Conformal Prediction framework are shown not to be required. The Aggregated Conformal Prediction procedure appears to be a promising approach for severely imbalanced datasets, retrieving a large majority of the active minority-class compounds while avoiding information loss or distortion.
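As an illustration only, the following minimal sketch shows one way the balancing inherent in conformal prediction can arise. It assumes class-conditional (Mondrian) calibration and median aggregation of p-values across several calibration splits; neither choice is stated in this abstract. The function names and the nonconformity scores are hypothetical, and the underlying model (for example a support vector machine) is left out.

```python
import numpy as np


def mondrian_p_values(cal_scores, cal_labels, test_scores):
    """Class-conditional conformal p-values (assumed scheme, not from the source).

    cal_scores:  (n_cal,) nonconformity of each calibration example for its true class
    cal_labels:  (n_cal,) integer class labels in 0..n_classes-1
    test_scores: (n_test, n_classes) nonconformity of each test example per candidate class
    """
    n_test, n_classes = test_scores.shape
    p = np.empty((n_test, n_classes))
    for c in range(n_classes):
        # Compare only against calibration examples of the same class, so the
        # minority class is calibrated on its own scores, not the majority's.
        ref = cal_scores[cal_labels == c]
        at_least_as_strange = (ref[None, :] >= test_scores[:, [c]]).sum(axis=1)
        p[:, c] = (at_least_as_strange + 1) / (ref.size + 1)
    return p


def aggregate_p_values(p_list):
    """Combine p-values from several calibration splits by their median (an assumption)."""
    return np.median(np.stack(p_list), axis=0)


def prediction_sets(p, significance):
    """A class is included in the prediction set when its p-value exceeds the significance level."""
    return p > significance
```

Because each class is compared only with calibration examples of the same class, no over- or undersampling step appears in the sketch; whether this matches the exact procedure of the study is not confirmed by the text above.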
Keywords: Aggregated conformal prediction; Imbalanced datasets; QSAR; Signature descriptors; Support vector machines.
Copyright © 2017 Elsevier Inc. All rights reserved.