Next-generation sequencing: big data meets high performance computing

Drug Discov Today. 2017 Apr;22(4):712-717. doi: 10.1016/j.drudis.2017.01.014. Epub 2017 Feb 2.

Abstract

The progress of next-generation sequencing has a major impact on medical and genomic research. This high-throughput technology can now produce billions of short DNA or RNA fragments in excess of a few terabytes of data in a single run. This leads to massive datasets used by a wide range of applications including personalized cancer treatment and precision medicine. In addition to the hugely increased throughput, the cost of using high-throughput technologies has been dramatically decreasing. A low sequencing cost of around US$1000 per genome has now rendered large population-scale projects feasible. However, to make effective use of the produced data, the design of big data algorithms and their efficient implementation on modern high performance computing systems is required.

Publication types

  • Review

MeSH terms

  • Algorithms
  • Computing Methodologies
  • Databases, Genetic
  • Genome / genetics*
  • Genomics / economics
  • Genomics / methods
  • High-Throughput Nucleotide Sequencing / economics
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Sequence Analysis, DNA / economics
  • Sequence Analysis, DNA / methods*