Predicting skin sensitizers with confidence - Using conformal prediction to determine applicability domain of GARD

Toxicol In Vitro. 2018 Apr:48:179-187. doi: 10.1016/j.tiv.2018.01.021. Epub 2018 Jan 31.

Abstract

GARD (Genomic Allergen Rapid Detection) is a cell-based alternative to animal testing for the identification of skin sensitizers. The assay is based on a biomarker signature of 200 genes measured in an in vitro model of dendritic cells following chemical stimulation, and it consistently reports predictive performances of approximately 90% for classification of external test sets. Within the field of in vitro skin sensitization testing, the definition of an applicability domain is often neglected by test developers, and assays are often assumed to be applicable across the entire chemical space. This study complements previous assessments of model performance with an estimate of confidence in individual classifications, as well as a statistically valid determination of the applicability domain of the GARD assay. Conformal prediction was incorporated into current GARD protocols, and a large external test dataset (n = 70) was classified at a confidence level of 85%. This yielded a valid model with a balanced accuracy of 88%, and none of the tested chemical reactivity domains was identified as outside the applicability domain of the assay. In conclusion, the results presented in this study complement previously reported predictive performances of GARD with a statistically valid assessment of uncertainty in each individual prediction, thus allowing for classification of skin sensitizers with confidence.
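
For readers unfamiliar with conformal prediction, the following minimal Python sketch illustrates the general idea of a class-conditional (Mondrian) inductive conformal predictor at an 85% confidence level. It is not the GARD implementation: the function names, the nonconformity scores, and the class labels are all hypothetical, and the abstract does not state which nonconformity measure or calibration scheme the study used.

```python
def conformal_p_value(calibration_scores, test_score):
    """P-value for one candidate label: the share of calibration
    nonconformity scores at least as large as the test score,
    counting the test sample itself (the +1 terms)."""
    at_least_as_strange = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def prediction_set(calibration_by_class, test_scores_by_class, confidence=0.85):
    """Return every label whose p-value exceeds the significance level
    (1 - confidence). Calibration is done per class (Mondrian style)."""
    significance = 1.0 - confidence
    labels = set()
    for label, test_score in test_scores_by_class.items():
        p = conformal_p_value(calibration_by_class[label], test_score)
        if p > significance:
            labels.add(label)
    return labels


# Hypothetical nonconformity scores (higher = less typical of the class).
calibration = {
    "sensitizer": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    "non-sensitizer": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
}
# Scores of one hypothetical test chemical under each candidate label.
test_sample = {"sensitizer": 0.25, "non-sensitizer": 0.95}

result = prediction_set(calibration, test_sample, confidence=0.85)
print(result)
```

In this framework a single-label set is a classification made at the chosen confidence level, a set containing both labels means the sample cannot be resolved at that level, and an empty set means the sample conforms to neither class in the calibration data. How the study maps these outcomes onto the applicability domain is not described in the abstract.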

Keywords: Applicability domain; Conformal prediction; GARD; In vitro assay; Skin sensitization.

Publication types

  • Validation Study

MeSH terms

  • Algorithms
  • Allergens / toxicity*
  • Animal Testing Alternatives*
  • Biomarkers
  • Dermatitis, Allergic Contact / genetics*
  • Dermatitis, Allergic Contact / pathology*
  • Gene Expression Profiling
  • Genomics / methods*
  • Humans
  • Machine Learning
  • Models, Theoretical
  • Predictive Value of Tests
  • Reproducibility of Results
  • Software

Substances

  • Allergens
  • Biomarkers