APT: An Automated Probe Tracker From Gene Expression Data

IEEE/ACM Trans Comput Biol Bioinform. 2021 Sep-Oct;18(5):1864-1874. doi: 10.1109/TCBB.2019.2958345. Epub 2021 Oct 7.

Abstract

Among currently available semi-automatic tools for detecting diagnostic probes relevant to a pathophysiological condition, ArrayMining and NCBI's GEO2R are the most popular. A shortcoming of both tools is that they list probes ordered by individual statistical significance, differing only in the statistical methods they use. While the newer tool, GEO2R, outputs either the top 250 or all genes according to its own ranking mechanism, ArrayMining requires the user to specify the number of probes. This study provides a way to select a probe set automatically by voting over the outputs of three statistical methods: the t-test, the Mann-Whitney test, and the empirical Bayes moderated t-test. It was also intriguing to find that the parameters of these statistical methods can be represented as a mathematical function of the group Fisher's discriminant ratio of a disease-control expression data pair. The resulting fully automatic method, APT, shows 88.97 percent success in including reported probes, compared with 80.40 and 87.60 percent for ArrayMining and GEO2R, respectively. Furthermore, across 10-fold cross-validation and 5 new test cases, APT performs better than both ArrayMining and GEO2R in terms of sensitivity and specificity.
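
The voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the voting rule, so the majority threshold used here (a probe is kept when at least two of the three methods select it) is an assumption, as are the function name `vote_probes`, the parameter `min_votes`, and the probe identifiers. The per-method probe lists are hard-coded placeholders standing in for the outputs of the t-test, the Mann-Whitney test, and the empirical Bayes moderated t-test; the sketch does not reproduce how APT sets the parameters of those tests.

```python
from collections import Counter


def vote_probes(selections, min_votes=2):
    """Return the probes chosen by at least `min_votes` of the methods.

    `selections` is a list of collections of probe IDs, one per method.
    The threshold is an assumed majority rule, not taken from the paper.
    """
    # Count each probe once per method, even if a method lists it twice.
    counts = Counter(probe for sel in selections for probe in set(sel))
    return sorted(probe for probe, n in counts.items() if n >= min_votes)


# Hypothetical probe IDs standing in for each method's significant probes.
t_test_hits = {"probe_A", "probe_B", "probe_C"}
mann_whitney_hits = {"probe_B", "probe_C", "probe_D"}
moderated_t_hits = {"probe_C", "probe_E"}

selected = vote_probes([t_test_hits, mann_whitney_hits, moderated_t_hits])
print(selected)  # ['probe_B', 'probe_C']
```

Because the size of the selected set follows from the vote rather than from a user-supplied count or a fixed cutoff such as the top 250, a scheme of this kind needs no manual choice of how many probes to report. Raising `min_votes` to 3 would keep only probes on which all three methods agree.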

MeSH terms

  • Bayes Theorem
  • Computational Biology / methods*
  • Gene Expression Profiling / methods*
  • Models, Statistical*
  • Pattern Recognition, Automated / methods