Deimos: A novel automated methodology for optimal grouping. Application to nanoinformatics case studies

Mol Inform. 2023 Aug;42(8-9):e2300019. doi: 10.1002/minf.202300019. Epub 2023 Aug 21.

Abstract

In this study we present deimos, a computational methodology for optimal grouping, applied to the read-across prediction of toxicity-related properties of engineered nanomaterials (ENMs). The method is based on the formulation and solution of a mixed-integer linear programming (MILP) problem that automatically and simultaneously performs feature selection, defines the grouping boundaries according to the response variable, and develops a linear regression model in each group. For each group/region, a characteristic centroid is defined so that untested ENMs can be allocated to the groups. The deimos MILP problem is integrated in a broader optimization workflow that selects the best-performing methodology among standard multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO) models, and the proposed deimos multiple-region model. The performance of the methodology is demonstrated through its application to benchmark ENM datasets and comparison with other predictive modelling approaches. The proposed method is not limited to ENM toxicity prediction, however; it can also be applied to property prediction for chemical entities other than ENMs.
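
The prediction step described in the abstract (allocate an untested ENM to a group via the group centroids, then apply that group's linear regression model) might be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the distance measure, so Euclidean distance to the nearest centroid is an assumption, and the names `GROUPS`, `allocate` and `predict`, together with all numeric values, are hypothetical. The MILP that would produce the groups, selected features and coefficients is not shown.

```python
import math

# Hypothetical output of a grouping step: each group has a centroid in
# feature space and its own linear model. All numbers are illustrative only.
GROUPS = [
    {"centroid": [1.0, 2.0], "intercept": 0.5, "coef": [1.0, -0.5]},
    {"centroid": [6.0, 7.0], "intercept": -1.0, "coef": [0.2, 0.3]},
]


def allocate(x, groups):
    """Return the index of the group whose centroid is nearest to x.

    Euclidean distance is an assumption; the abstract does not specify it.
    """
    return min(
        range(len(groups)),
        key=lambda g: math.dist(x, groups[g]["centroid"]),
    )


def predict(x, groups):
    """Allocate x to a group, then apply that group's linear model."""
    g = allocate(x, groups)
    model = groups[g]
    y = model["intercept"] + sum(c * v for c, v in zip(model["coef"], x))
    return g, y


if __name__ == "__main__":
    # An untested sample described by two (hypothetical) selected features.
    group, value = predict([5.0, 8.0], GROUPS)
    print(f"allocated to group {group}, predicted response {value:.3f}")
```

Keeping allocation separate from prediction reflects the two roles named in the abstract: the centroids decide group membership, while the per-group regression supplies the predicted value.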

Keywords: grouping; mathematical optimization; nanoinformatics; predictive modelling; read-across.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Benchmarking
  • Linear Models
  • Nanostructures* / chemistry