FDR control of detected regions by multiscale matched filtering

Commun Stat Simul Comput. 2017;46(1):127-144. doi: 10.1080/03610918.2014.957842. Epub 2014 Dec 23.

Abstract

Feature extraction from noisy observed samples is a common and important problem in statistics and engineering. This paper presents a novel general statistical approach to the region detection problem in long data sequences. The proposed technique combines multi-scale kernel regression with statistical multiple testing to detect regions while controlling the false discovery rate (FDR) and maximizing the signal-to-noise ratio (SNR) via matched filtering. This is achieved by treating the one-dimensional (1D) region detection problem as its equivalent zero-dimensional (0D) peak detection problem. The detection method does not require a priori knowledge of the shape of the non-zero regions. However, if the shape of the non-zero regions is known a priori, e.g., a rectangular pulse, the signal regions can also be reconstructed from the detected peaks, which serve as their topological point representatives. Simulations show that the method can effectively perform signal detection and reconstruction under high noise conditions, while controlling the FDR of the detected regions and their reconstructed length.
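As a rough illustration of the pipeline described above (smooth with a matched kernel, reduce each region to a peak, then test the peaks under FDR control), the following is a minimal single-scale sketch and not the paper's implementation. All function names are hypothetical. It assumes a rectangular template of known width, Gaussian white noise with known standard deviation `sigma`, and the Benjamini-Hochberg step-up procedure for the multiple testing step. The p-values use a plain Gaussian tail at each local maximum, which is a simplification: a rigorous procedure would use the null distribution of peak heights, so this sketch should not be read as guaranteeing FDR control.

```python
import math
import numpy as np

def matched_filter(y, width):
    """Convolve y with a unit-energy rectangular template (assumed shape)."""
    # Unit L2 norm keeps the variance of filtered white noise equal to sigma^2.
    template = np.ones(width) / math.sqrt(width)
    return np.convolve(y, template, mode="same")

def local_maxima(z):
    """Indices of strict interior local maxima of z."""
    idx = np.arange(1, len(z) - 1)
    return idx[(z[idx] > z[idx - 1]) & (z[idx] > z[idx + 1])]

def benjamini_hochberg(pvals, alpha):
    """Boolean rejection mask from the Benjamini-Hochberg step-up procedure."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    reject = np.zeros(m, dtype=bool)
    if m == 0:
        return reject
    order = np.argsort(pvals)
    below = pvals[order] <= alpha * np.arange(1, m + 1) / m
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank meeting the BH bound
        reject[order[:k + 1]] = True
    return reject

def detect_peaks(y, width, sigma, alpha):
    """Return indices of local maxima of the filtered signal declared significant."""
    z = matched_filter(y, width)
    peaks = local_maxima(z)
    # Simplified one-sided Gaussian tail p-value at each peak (assumption).
    pvals = np.array([0.5 * math.erfc(z[i] / (sigma * math.sqrt(2.0)))
                      for i in peaks])
    return peaks[benjamini_hochberg(pvals, alpha)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = np.zeros(2000)
    signal[500:520] = 1.5    # two rectangular pulses of width 20
    signal[1400:1420] = 1.5
    y = signal + rng.normal(0.0, 1.0, size=signal.size)
    print(detect_peaks(y, width=20, sigma=1.0, alpha=0.05))
```

Each detected index stands in for one region. If the pulse shape is known, a region of the template width centered on the peak gives a simple reconstruction. A multi-scale version would repeat the filtering step over several candidate widths.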

Keywords: False discovery rate; Feature extraction; Kernel regression; Local polynomial regression; Matched filtering; Multiple testing; Region detection; Signal reconstruction.