Identification of Compounds That Interfere with High-Throughput Screening Assay Technologies

Laurianne David; Jarrod Walsh; Noé Sturm; Isabella Feierberg; J Willem M Nissink; Hongming Chen; Jürgen Bajorath; Ola Engkvist

doi:10.1002/cmdc.201900395

ChemMedChem. 2019 Oct 17;14(20):1795-1802. doi: 10.1002/cmdc.201900395. Epub 2019 Sep 19.

Authors

Laurianne David¹ ², Jarrod Walsh³, Noé Sturm⁴, Isabella Feierberg⁵, J Willem M Nissink⁶, Hongming Chen¹, Jürgen Bajorath², Ola Engkvist¹

Affiliations

¹ Hit Discovery, Discovery Sciences, R&D BioPharmaceuticals, AstraZeneca Gothenburg, Pepparedsleden 1, 431 83, Mölndal, Sweden.
² Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität Bonn, Endenicher Allee 19c, 53115, Bonn, Germany.
³ Hit Discovery, Discovery Sciences, R&D BioPharmaceuticals, AstraZeneca Cambridge, Alderley Park, Macclesfield, SK10 4TG, UK.
⁴ Data Science and AI, Drug Safety & Metabolism, R&D BioPharmaceuticals, AstraZeneca Gothenburg, Pepparedsleden 1, 431 83, Mölndal, Sweden.
⁵ Hit Discovery, Discovery Sciences, R&D BioPharmaceuticals, AstraZeneca Boston, 35 Gatehouse Drive, Waltham, MA, 02451, USA.
⁶ Computational Chemistry, Oncology R&D, AstraZeneca, Cambridge Science Park, Milton Road, Cambridge, CB4 0WG, UK.

Abstract

A significant challenge in high-throughput screening (HTS) campaigns is the identification of assay technology interference compounds. A Compound Interfering with an Assay Technology (CIAT) gives false readouts in many assays. CIATs are often mistaken for viable hits and investigated in follow-up studies, which impedes research and wastes resources. In this study, we developed a machine-learning (ML) model to predict CIATs for three assay technologies. The model was trained on known CIATs and non-CIATs (NCIATs) that were identified in artefact assays and represented by 2D structural descriptors. Conventional methods for identifying CIATs rely on statistical analysis of historical primary screening data and do not take into account experimental assays that identify CIATs. Our results show successful prediction of CIATs for existing and novel compounds and provide a complementary and wider set of predicted CIATs compared to BSF, a published structure-independent model, and to the PAINS substructural filters. Our analysis is an example of how well-curated datasets can provide powerful predictive models despite their relatively small size.
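As an illustration of the general idea of learning CIAT/NCIAT labels from 2D structural descriptors, the following is a minimal sketch only. The abstract does not state which learning algorithm or descriptors were used, so a simple k-nearest-neighbour vote on Tanimoto similarity is swapped in here purely for clarity. The fingerprints, labels, and the names `tanimoto`, `predict_ciat`, and `TRAINING` are hypothetical and are not taken from the study.

```python
from typing import FrozenSet, List, Tuple

# A 2D structural fingerprint, represented as the set of indices of its "on" bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_ciat(
    query: Fingerprint,
    training: List[Tuple[Fingerprint, bool]],
    k: int = 3,
) -> bool:
    """Majority vote over the k training compounds most similar to the query.

    Returns True if the query is predicted to be a CIAT, False for NCIAT.
    """
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    top = ranked[:k]
    votes = sum(1 for _, is_ciat in top if is_ciat)
    return votes * 2 > len(top)


# Hypothetical toy training set (made-up bit indices, NOT real assay data):
# True = CIAT, False = NCIAT, as would be labeled from an artefact assay.
TRAINING: List[Tuple[Fingerprint, bool]] = [
    (frozenset({1, 2, 3, 4}), True),
    (frozenset({1, 2, 3, 5}), True),
    (frozenset({2, 3, 4, 6}), True),
    (frozenset({10, 11, 12}), False),
    (frozenset({10, 12, 13}), False),
    (frozenset({11, 12, 14}), False),
]

if __name__ == "__main__":
    print(predict_ciat(frozenset({1, 2, 4, 7}), TRAINING))    # resembles the CIAT group
    print(predict_ciat(frozenset({10, 11, 13}), TRAINING))    # resembles the NCIAT group
```

In practice the fingerprints would be computed from chemical structures with a cheminformatics toolkit, and a separate model would presumably be needed for each assay technology, since interference is technology-specific.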

Keywords: assay interference; computational chemistry; frequent hitters; high-throughput screening; machine learning.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Databases, Factual
High-Throughput Screening Assays*
Machine Learning
Models, Molecular
Molecular Structure
Organic Chemicals / chemistry*
Particle Size

Substances

Organic Chemicals

Grants and funding

676434/H2020 Marie Skłodowska-Curie Actions/International