Fragment merger: an online tool to merge overlapping long sequence fragments

Viruses. 2013 Mar 12;5(3):824-33. doi: 10.3390/v5030824.

Abstract

While PCR amplicons can extend to a few thousand bases, the length of sequences from direct Sanger sequencing is limited to 500-800 nucleotides. Therefore, several fragments may be required to cover an amplicon, a gene or an entire genome. These fragments are typically sequenced in an overlapping fashion and assembled by manually sliding and visually aligning the sequences. This is time-consuming, repetitive and error-prone, and is further complicated by circular genomes. An online tool that merges two to twelve long overlapping sequence fragments was developed. Either chromatograms or FASTA files are submitted to the tool, which trims poor-quality ends of chromatograms according to user-specified parameters. Fragments are assembled into a single sequence by calling the EMBOSS merger tool repeatedly, one fragment after another. Output includes the number of trimmed nucleotides, details of each merge, and an optional alignment to a reference sequence. The final merged sequence is displayed and can be downloaded in FASTA format. All output files can be downloaded as a ZIP archive. This tool allows for easy and automated assembly of overlapping sequences and is aimed at researchers without specialist computer skills. The tool is genome- and organism-agnostic and has been developed using hepatitis B virus sequence data.
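The trim-then-merge workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the tool's implementation. The abstract says only that chromatogram ends are trimmed according to user-specified parameters and that fragments are merged by consecutive calls to EMBOSS merger, which is alignment-based. The sketch swaps in two simplifications and labels them as assumptions: a per-base quality threshold for trimming, and an exact suffix/prefix match in place of alignment. The function names (`trim_ends`, `merge_pair`, `merge_fragments`) and the parameters `min_q` and `min_overlap` are hypothetical.

```python
def trim_ends(seq, quals, min_q=20):
    """Trim bases below min_q from both ends (assumed trimming rule).

    Returns the trimmed sequence and the number of bases removed.
    """
    start = 0
    while start < len(seq) and quals[start] < min_q:
        start += 1
    end = len(seq)
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], start + (len(seq) - end)


def merge_pair(left, right, min_overlap=5):
    """Join two fragments on their longest exact suffix/prefix overlap.

    Simplified stand-in for an alignment-based merge: it tolerates no
    mismatches or gaps in the overlap region.
    """
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:], k
    raise ValueError("no overlap of at least %d bases found" % min_overlap)


def merge_fragments(fragments, min_overlap=5):
    """Merge fragments consecutively: the running result is merged with
    the next fragment, in the order given."""
    merged = fragments[0]
    overlaps = []
    for frag in fragments[1:]:
        merged, k = merge_pair(merged, frag, min_overlap)
        overlaps.append(k)
    return merged, overlaps


# Toy fragments (invented for illustration, not real sequence data).
fragments = ["ACGTACGTTTGACC", "TTGACCGGATAGCA", "ATAGCATTCGA"]
merged, overlaps = merge_fragments(fragments)
trimmed, n_trimmed = trim_ends("NNACGTNN", [5, 8, 30, 30, 30, 30, 9, 4])
```

Because each merge feeds its result into the next, the fragments must be supplied in their order along the amplicon or genome. Real reads contain base-calling differences in the overlap, which is why an alignment-based merge is used in practice rather than exact matching.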

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Nucleic Acid
  • Hepatitis B virus / chemistry
  • Hepatitis B virus / genetics*
  • Internet
  • Online Systems / instrumentation*
  • Sequence Alignment / instrumentation*
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / instrumentation
  • Sequence Analysis, DNA / methods*
  • Software