GaMRed: Adaptive Filtering of High-Throughput Biological Data

IEEE/ACM Trans Comput Biol Bioinform. 2020 Jan-Feb;17(1):149-157. doi: 10.1109/TCBB.2018.2858825. Epub 2018 Jul 23.

Abstract

Filtering out non-informative features, those whose signal does not change between the compared experimental conditions, can significantly increase the sensitivity of methods used to detect differentially expressed genes or other molecular components measured in high-throughput biological experiments. Filtering criteria can be based on the averages or variances of signal levels across samples. The crucial steps in feature filtering are the choice of filter type and of cut-off threshold, both of which are specific to the particular dataset. In this paper, we present an algorithm and a stand-alone application, GaMRed, for adaptive filtering of insignificant features in high-throughput data, based on Gaussian mixture decomposition. We tested the performance of our algorithm on datasets from three different high-throughput biological experiments. We estimated the number of differentially expressed features after applying multiple testing correction and performed functional analysis of the obtained features using Gene Ontology terms. We also checked whether control of the false discovery rate and the family-wise error rate remains at an appropriate level after feature filtering. GaMRed is fast and automatic, and it does not require expert knowledge of parameter tuning. The algorithm increases the sensitivity of methods used to find differentially expressed features and the biological validity of the findings. The program can be downloaded from: http://zaed.aei.polsl.pl/index.php/pl/oprogramowanie-zaed.
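The idea of filtering by Gaussian mixture decomposition can be illustrated with a minimal Python sketch. It is not the GaMRed implementation: the abstract does not state how the number of components is chosen or how the cut-off is derived, so the two-component model, the quantile-based initialisation, the rule "drop features assigned to the lowest-mean component", and all function names (`fit_gmm_1d`, `keep_mask`) are assumptions made for illustration. The per-feature scores are synthetic stand-ins for, say, log-variances across samples.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_gmm_1d(values, k=2, iters=200, min_sd=1e-3):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative)."""
    xs = sorted(values)
    n = len(xs)
    # Initialise each component from one of k equal-sized quantile chunks.
    chunks = [xs[i * n // k:(i + 1) * n // k] for i in range(k)]
    weights = [1.0 / k] * k
    means = [sum(c) / len(c) for c in chunks]
    sds = [max(min_sd, math.sqrt(sum((x - m) ** 2 for x in c) / len(c)))
           for c, m in zip(chunks, means)]
    for _ in range(iters):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            dens = [w * normal_pdf(x, m, s)
                    for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and standard deviation.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, values)) / nj
            sds[j] = max(min_sd, math.sqrt(var))
    return weights, means, sds


def keep_mask(values, weights, means, sds):
    """True for features kept by the (assumed) filtering rule.

    Assumption: the lowest-mean component models non-informative features.
    A feature is kept if it lies above that component's mean and its most
    probable component is a different one.
    """
    low = min(range(len(means)), key=lambda j: means[j])
    mask = []
    for x in values:
        dens = [w * normal_pdf(x, m, s)
                for w, m, s in zip(weights, means, sds)]
        best = max(range(len(dens)), key=lambda j: dens[j])
        mask.append(x > means[low] and best != low)
    return mask


# Synthetic per-feature scores: 300 low-signal and 100 high-signal features.
random.seed(0)
scores = ([random.gauss(0.0, 0.5) for _ in range(300)]
          + [random.gauss(5.0, 0.7) for _ in range(100)])

weights, means, sds = fit_gmm_1d(scores, k=2)
mask = keep_mask(scores, weights, means, sds)
print("kept", sum(mask), "of", len(scores), "features")
```

Only the features where `mask` is true would then be passed to the differential-expression test, which reduces the number of hypotheses entering the multiple testing correction. The cut-off here comes from the fitted mixture rather than from a hand-picked threshold, which is the sense in which such a filter adapts to the dataset.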

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Computational Biology / methods*
  • Databases, Factual
  • HeLa Cells
  • High-Throughput Screening Assays / methods*
  • Humans
  • Lung / cytology
  • Lung / pathology
  • Mice
  • Neoplasms / genetics
  • Neoplasms / metabolism
  • Sequence Analysis, RNA
  • User-Computer Interface