Aggregating Knockoffs for False Discovery Rate Control with an Application to Gut Microbiome Data

Entropy (Basel). 2021 Feb 16;23(2):230. doi: 10.3390/e23020230.

Abstract

Recent discoveries suggest that the gut microbiome plays an important role in our health and wellbeing. However, gut microbiome data are intricate; for example, the microbial diversity in the gut makes the data high-dimensional. While there are dedicated high-dimensional methods, such as the lasso estimator, they always come with the risk of false discoveries. Knockoffs are a recent approach to controlling the false discovery rate. In this paper, we show that knockoffs can be aggregated to increase power while retaining sharp control over the false discoveries. We support our method both in theory and in simulations, and we show that it can lead to new discoveries on microbiome data from the American Gut Project. In particular, our results indicate that several phyla that have been overlooked so far are associated with obesity.

Keywords: false discovery rate control; gut microbiome; knockoffs; variable selection.