Fine-grained parallelization of fitness functions in bioinformatics optimization problems: gene selection for cancer classification and biclustering of gene expression data

BMC Bioinformatics. 2016 Aug 31;17(1):330. doi: 10.1186/s12859-016-1200-9.

Abstract

Background: Metaheuristics are widely used to solve large combinatorial optimization problems in bioinformatics because of the huge set of possible solutions. Two representative problems are gene selection for cancer classification and biclustering of gene expression data. In most cases, these metaheuristics, as well as other non-linear techniques, apply a fitness function to each possible solution with a size-limited population, and that step involves higher latencies than other parts of the algorithms, which is the reason why the execution time of the applications will mainly depend on the execution time of the fitness function. In addition, it is usual to find floating-point arithmetic formulations for the fitness functions. This way, a careful parallelization of these functions using the reconfigurable hardware technology will accelerate the computation, specially if they are applied in parallel to several solutions of the population.

Results: A fine-grained parallelization of two floating-point fitness functions of different complexities and features involved in biclustering of gene expression data and gene selection for cancer classification allowed for obtaining higher speedups and power-reduced computation with regard to usual microprocessors.

Conclusions: The results show better performances using reconfigurable hardware technology instead of usual microprocessors, in computing time and power consumption terms, not only because of the parallelization of the arithmetic operations, but also thanks to the concurrent fitness evaluation for several individuals of the population in the metaheuristic. This is a good basis for building accelerated and low-energy solutions for intensive computing scenarios.

Keywords: Biclustering; Cancer classification; FPGA; Fitness function; Floating-point arithmetic; Metaheuristics; Parallelism.

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Neoplasms / classification
  • Neoplasms / genetics*
  • Neoplasms / pathology
  • Software