Survey of gene splicing algorithms based on reads

Bioengineered. 2017 Nov 2;8(6):750-758. doi: 10.1080/21655979.2017.1373538. Epub 2017 Sep 21.

Abstract

Gene splicing is the process of assembling a large number of unordered short sequence fragments (reads) into the original genome sequence as accurately as possible. This article reviews several popular read-based splicing algorithms, including reference-genome algorithms and de novo splicing algorithms (greedy extension, Overlap-Layout-Consensus graph, De Bruijn graph). We also discuss a new splicing method based on the MapReduce strategy and Hadoop. By comparing these algorithms, we draw some conclusions and offer suggestions for gene splicing research.
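
As an illustration of the De Bruijn graph approach named above, the following is a minimal sketch, not the method of any specific assembler covered in the survey. It assumes error-free reads from a single strand and a genome with no repeat longer than k-1 bases, so that the graph has a single Eulerian path. The function names `build_de_bruijn` and `assemble` and the example reads are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a De Bruijn graph: nodes are (k-1)-mers, edges are distinct k-mers."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in seen:
                continue  # keep one edge per distinct k-mer
            seen.add(kmer)
            graph[kmer[:-1]].append(kmer[1:])
    return graph


def assemble(reads, k):
    """Spell a sequence by walking an Eulerian path (Hierholzer's algorithm)."""
    graph = build_de_bruijn(reads, k)

    # Count incoming edges so the start node (out-degree = in-degree + 1) can be found.
    indegree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            indegree[node] += 1
    nodes = set(graph) | set(indegree)

    start = next(
        (n for n in nodes if len(graph.get(n, [])) - indegree.get(n, 0) == 1),
        None,
    )
    if start is None:
        start = min(nodes)  # circular case: any node works; pick one deterministically

    # Copy the adjacency lists so edges can be consumed during the walk.
    remaining = {n: list(graph.get(n, [])) for n in nodes}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining[node]:
            stack.append(remaining[node].pop())
        else:
            path.append(stack.pop())
    path.reverse()

    # The first node contributes k-1 bases; each later node adds its last base.
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    example_reads = ["ATGGCG", "GGCGTG", "CGTGCA"]  # made-up overlapping reads
    print(assemble(example_reads, k=4))
```

Real read data would also need error correction, reverse-complement handling, and repeat resolution, all of which this sketch omits.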

Keywords: De Bruijn graph; Hadoop; MapReduce; gene splicing; read.

MeSH terms

  • Algorithms*
  • Genome, Bacterial
  • High-Throughput Nucleotide Sequencing
  • Sequence Analysis, DNA / methods*
  • Software