Proposed method for dimensionality reduction based on framework in gene expression domain

Genet Mol Res. 2014 Dec 12;13(4):10582-91. doi: 10.4238/2014.December.12.21.

Abstract

An excessive number of attributes can hinder the search for patterns and the extraction of useful knowledge, because it harms the learning performance of algorithms in both speed and success rate. Dimensionality reduction methods are therefore an important alternative; however, these methods do not address the reduction of attributes within a specific domain. This article presents a method, based on domain framework concepts, for reducing attributes in a domain. The input to the method is a set of databases related to a domain, and the main process is the identification of common and variable attributes, followed by the reduction of attributes in the original database. The proposed method was applied in the gene expression domain, using three databases. The method can be used to analyze the most relevant attributes in a specific domain, granting greater confidence in models created for a data mining task. A previously known data mining method, attribute selection, was also applied to the three databases to compare the results. Analysis of the results using cross-validation revealed that applying the methods improved success rates compared with the databases containing the full range of attributes.
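
The core step described above, identifying common and variable attributes across a set of domain databases and then reducing the original database, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that "common" means an attribute present in every database, that "variable" means an attribute present in only some of them, and that the reduction keeps the common attributes; the abstract does not state which attributes are retained. The function names and the gene identifiers are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[str, float]


def split_common_variable(attribute_sets: List[Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (common, variable) attributes across a set of domain databases.

    Assumption: common = present in every database,
    variable = present in at least one database but not in all.
    """
    if not attribute_sets:
        return set(), set()
    common = set.intersection(*attribute_sets)
    variable = set.union(*attribute_sets) - common
    return common, variable


def reduce_attributes(rows: List[Row], keep: Set[str]) -> List[Row]:
    """Project every row of a database onto the attributes in `keep`."""
    return [{name: value for name, value in row.items() if name in keep} for row in rows]


# Hypothetical attribute sets for three gene expression databases.
databases = [
    {"gene_a", "gene_b", "gene_c"},
    {"gene_b", "gene_c", "gene_d"},
    {"gene_b", "gene_c", "gene_e"},
]
common, variable = split_common_variable(databases)

# Assumption: the original database is reduced to the common attributes.
original = [{"gene_a": 0.1, "gene_b": 0.5, "gene_c": 0.9}]
reduced = reduce_attributes(original, common)
```

In this sketch `common` holds `gene_b` and `gene_c`, and `reduced` keeps only those two columns. The reduced database could then be compared with the full one under cross-validation, as the abstract describes.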

MeSH terms

  • Algorithms*
  • Data Mining*
  • Databases, Genetic*
  • Gene Expression Profiling
  • Humans
  • Oligonucleotide Array Sequence Analysis / methods