GPU-Acceleration of Sequence Homology Searches with Database Subsequence Clustering

PLoS One. 2016 Aug 2;11(8):e0157338. doi: 10.1371/journal.pone.0157338. eCollection 2016.

Abstract

Sequence homology searches are used in various fields and require large amounts of computation time, especially in metagenomic analysis, owing to the large number of queries and the size of the database. Graphics processing units (GPUs) are widely used as a low-cost, high-performance platform for accelerating such computation. We therefore mapped the time-consuming steps of GHOSTZ, a state-of-the-art homology search algorithm for protein sequences, onto a GPU and implemented the result as GHOSTZ-GPU. In addition, we optimized memory access for GPU calculations and for communication between the CPU and GPU. In an evaluation on metagenomic data, GHOSTZ-GPU with 12 CPU threads and 1 GPU was approximately 3.0- to 4.1-fold faster than GHOSTZ with 12 CPU threads. Moreover, GHOSTZ-GPU with 12 CPU threads and 3 GPUs was approximately 5.8- to 7.7-fold faster than GHOSTZ with 12 CPU threads.
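
The two reported speedup ranges can be combined to bound the gain from moving from 1 GPU to 3 GPUs. The sketch below is illustrative arithmetic on the abstract's figures only. It assumes both ranges were measured against the same 12-thread GHOSTZ baseline on the same datasets, and it does not assume which end of one range pairs with which end of the other, so it reports the widest possible interval. The function name `ratio_bounds` is hypothetical.

```python
# Speedups over GHOSTZ with 12 CPU threads, as reported in the abstract.
one_gpu = (3.0, 4.1)    # GHOSTZ-GPU, 12 CPU threads + 1 GPU
three_gpu = (5.8, 7.7)  # GHOSTZ-GPU, 12 CPU threads + 3 GPUs


def ratio_bounds(numerator, denominator):
    """Return the widest (low, high) interval for numerator / denominator.

    Each argument is a (min, max) range. The lowest possible ratio pairs the
    smallest numerator with the largest denominator, and vice versa.
    """
    low = numerator[0] / denominator[1]
    high = numerator[1] / denominator[0]
    return low, high


low, high = ratio_bounds(three_gpu, one_gpu)
print(f"3 GPUs vs 1 GPU: between {low:.2f}x and {high:.2f}x")
```

Under these assumptions the interval stays below 3x, which suggests that tripling the GPU count gave less than linear scaling in this evaluation; the abstract itself does not state a per-dataset scaling figure.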

MeSH terms

  • Algorithms
  • Animals
  • Cluster Analysis
  • Computer Graphics
  • Computing Methodologies*
  • DNA / chemistry
  • Databases, Genetic
  • Humans
  • Metagenomics / economics
  • Metagenomics / methods*
  • Proteins / chemistry
  • Sequence Homology*
  • Software*
  • Time Factors

Substances

  • Proteins
  • DNA

Grants and funding

This work was supported by a Grant-in-Aid for the Japan Society for the Promotion of Science Fellows (Grant number 248766) to SS, the Strategic Programs for Innovative Research Field 1 Supercomputational Life Science of the Ministry of Education, Culture, Sports, Science and Technology of Japan to YA, and Cancer Research Development funding from the National Cancer Center of Japan to YA.