Evaluation of Machine Learning Classifiers to Predict Compound Mechanism of Action When Transferred across Distinct Cell Lines

Scott J Warchal; John C Dawson; Neil O Carragher

doi:10.1177/2472555218820805

SLAS Discov. 2019 Mar;24(3):224-233. doi: 10.1177/2472555218820805. Epub 2019 Jan 29.

Authors

Scott J Warchal¹, John C Dawson¹, Neil O Carragher¹

Affiliation

¹ Cancer Research UK Edinburgh Centre, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, Scotland, UK.

Abstract

Multiparametric high-content imaging assays have become established to classify cell phenotypes from functional genomic and small-molecule library screening assays. Several groups have implemented machine learning classifiers to predict the mechanism of action of phenotypic hit compounds by comparing the similarity of their high-content phenotypic profiles with a reference library of well-annotated compounds. However, the majority of such examples are restricted to a single cell type often selected because of its suitability for simple image analysis and intuitive segmentation of morphological features. The aim of the current study was to evaluate and compare the performance of a classic ensemble-based tree classifier trained on extracted morphological features and a deep learning classifier using convolutional neural networks (CNNs) trained directly on images from the same dataset to predict compound mechanism of action across a morphologically and genetically distinct cell panel. Our results demonstrate that application of a CNN classifier delivers equivalent accuracy compared with an ensemble-based tree classifier at compound mechanism of action prediction within cell lines. However, our CNN analysis performs worse than an ensemble-based tree classifier when trained on multiple cell lines at predicting compound mechanism of action on an unseen cell line.
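The "unseen cell line" evaluation described in the abstract can be illustrated as a leave-one-cell-line-out split: train on all cell lines but one, then score predictions on the held-out line. The sketch below is only an illustration of that split, not the authors' pipeline. A simple nearest-centroid classifier stands in for the ensemble-based tree and CNN classifiers used in the study, and the feature values, mechanism labels, and cell line names are all hypothetical.

```python
from collections import defaultdict


def leave_one_cell_line_out(cell_lines):
    """Yield (held_out, train_idx, test_idx) with one cell line held out per fold."""
    for held_out in sorted(set(cell_lines)):
        train_idx = [i for i, c in enumerate(cell_lines) if c != held_out]
        test_idx = [i for i, c in enumerate(cell_lines) if c == held_out]
        yield held_out, train_idx, test_idx


def fit_centroids(features, labels):
    """Mean feature vector per mechanism-of-action label (stand-in classifier)."""
    grouped = defaultdict(list)
    for row, label in zip(features, labels):
        grouped[label].append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, row):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(row, centroids[label])),
    )


# Hypothetical toy data: two features per compound profile, two mechanism
# classes, three made-up cell line names. None of these values come from the study.
features = [[0.1, 0.9], [0.9, 0.1], [0.2, 0.8], [0.8, 0.2], [0.15, 0.85], [0.85, 0.15]]
moa = ["A", "B", "A", "B", "A", "B"]
cell_lines = ["line1", "line1", "line2", "line2", "line3", "line3"]

# Accuracy on each held-out cell line, using only the other lines for training.
accuracy_by_line = {}
for held_out, train_idx, test_idx in leave_one_cell_line_out(cell_lines):
    centroids = fit_centroids(
        [features[i] for i in train_idx], [moa[i] for i in train_idx]
    )
    hits = sum(predict(centroids, features[i]) == moa[i] for i in test_idx)
    accuracy_by_line[held_out] = hits / len(test_idx)
```

Reporting accuracy per held-out line, rather than one pooled figure, makes it visible when a classifier that performs well within cell lines transfers poorly to a particular unseen line.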

Keywords: cancer and cancer drugs; cell-based assays; high-content screening; machine learning.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Cell Line, Tumor
Cytological Techniques / methods
Humans
Machine Learning*
Neural Networks, Computer