From complex data to biological insight: 'DEKER' feature selection and network inference

Sean M S Hayes; Jeffrey R Sachs; Carolyn R Cho

doi:10.1007/s10928-021-09792-7

J Pharmacokinet Pharmacodyn. 2022 Feb;49(1):81-99. doi: 10.1007/s10928-021-09792-7. Epub 2021 Nov 17.

Authors

Sean M S Hayes¹, Jeffrey R Sachs², Carolyn R Cho²

Affiliations

¹ Quantitative Pharmacology and Pharmacometrics, Merck & Co., Inc., Kenilworth, NJ, USA. sean.hayes@merck.com.
² Quantitative Pharmacology and Pharmacometrics, Merck & Co., Inc., Kenilworth, NJ, USA.

Abstract

Network inference is a valuable approach for gaining mechanistic insight from high-dimensional biological data. Existing methods for network inference focus on ranking all possible relations (edges) among all measured quantities (features), such as genes, proteins, and metabolites, which yields a dense network that is challenging to interpret. Identifying a sparse, interpretable network using these methods thus requires an error-prone thresholding step, which compromises their performance. In this article we propose a new method, DEKER-NET, that addresses this limitation by directly identifying a sparse, interpretable network without thresholding, improving real-world performance. DEKER-NET uses a novel machine learning method for feature selection in an iterative framework for network inference. DEKER-NET is extremely flexible: it handles linear and nonlinear relations, makes no assumptions about the underlying distribution of the data, and is suitable for categorical or continuous variables. We test our method on the Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge data, demonstrating that it can directly identify sparse, interpretable networks without thresholding while maintaining performance comparable to the hypothetical best-case thresholded network of other methods.

Keywords: Feature selection; Machine learning; Multiomics; Network inference; Systems biology.
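
The general idea described in the abstract, building a network by running feature selection once per target feature and keeping only the selected predictors as edges, can be illustrated with a minimal sketch. This is not the DEKER algorithm, which the abstract does not specify. The selector below is a stand-in (greedy forward selection with linear least squares), and the names `select_features`, `infer_network`, and `min_gain` are hypothetical. Unlike DEKER-NET as described, the stand-in captures only linear relations and relies on its own stopping rule.

```python
import numpy as np


def select_features(X, y, min_gain=0.05):
    """Stand-in selector (NOT DEKER): greedy forward selection.

    Repeatedly adds the column of X that most increases the R^2 of a
    linear least-squares fit to y, and stops when the best available
    increase is smaller than min_gain (a hypothetical stopping rule).
    Returns the list of selected column indices.
    """
    n, p = X.shape
    y_centered = y - y.mean()
    total = float(y_centered @ y_centered)
    selected, best_r2 = [], 0.0
    while len(selected) < p:
        best_j, best_new = None, best_r2
        for j in range(p):
            if j in selected:
                continue
            # Fit y on the already-selected columns plus candidate j.
            A = np.column_stack([X[:, selected + [j]], np.ones(n)])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ coef
            r2 = 1.0 - float(resid @ resid) / total
            if r2 > best_new:
                best_j, best_new = j, r2
        if best_j is None or best_new - best_r2 < min_gain:
            break
        selected.append(best_j)
        best_r2 = best_new
    return selected


def infer_network(data, names, min_gain=0.05):
    """Treat each feature in turn as the target and select its predictors.

    Each selected predictor becomes a directed edge (predictor, target),
    so the result is sparse by construction rather than a ranked list
    of all possible edges.
    """
    edges = set()
    for t, target in enumerate(names):
        others = [i for i in range(len(names)) if i != t]
        chosen = select_features(data[:, others], data[:, t], min_gain)
        edges.update((names[others[k]], target) for k in chosen)
    return edges


# Synthetic data (an assumption for illustration): b depends on a, c is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = 2.0 * a + 0.1 * rng.normal(size=500)
c = rng.normal(size=500)
edges = infer_network(np.column_stack([a, b, c]), ["a", "b", "c"])
print(sorted(edges))
```

In this toy setting the independent feature `c` should receive no edges, while `a` and `b` select each other. A ranking-based method would instead assign a score to every pair, including pairs involving `c`, and leave the choice of cutoff to the user.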

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Gene Regulatory Networks*
Machine Learning
Proteins

Substances

Proteins