NGS-FC: A Next-Generation Sequencing Data Format Converter

IEEE/ACM Trans Comput Biol Bioinform. 2018 Sep-Oct;15(5):1683-1691. doi: 10.1109/TCBB.2017.2722442. Epub 2017 Jul 3.

Abstract

With the widespread adoption of next-generation sequencing (NGS) technologies, millions of sequences have been produced. Many databases have been created to store and organize this high-throughput sequencing data, and numerous analysis programs and tools have been developed in recent years. Most of them use their own specific formats for data representation and storage. Data interoperability has therefore become a crucial challenge, and many tools have been developed to convert NGS data from one format to another. However, most of these tools handle only a limited set of formats. Here, we present NGS-FC (Next-Generation Sequencing Format Converter), which provides a framework to support conversion between several formats. It currently supports 14 formats and provides interfaces that enable users to improve the existing converters and add new ones. Moreover, NGS-FC achieved competitive overall performance in comparison with some existing converters in terms of RAM usage and running time. The software is written in Java and can be executed standalone. The source code and documentation are freely available at http://sysbio.suda.edu.cn/NGS-FC.
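To make the idea of a converter framework with user-extensible interfaces concrete, the following is a minimal sketch of how such a plug-in point might look in Java. It is not taken from the NGS-FC source code: the names `FormatConverter`, `FastqToFasta`, and `ConverterRegistry` are hypothetical, and FASTQ and FASTA are assumed here purely as an illustrative format pair, since the abstract does not list the 14 supported formats. The sketch also assumes four-line FASTQ records (no wrapped sequence lines).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the NGS-FC code base.
final class ConverterSketch {

    /** Assumed plug-in point: one implementation per (source, target) format pair. */
    interface FormatConverter {
        String sourceFormat();
        String targetFormat();
        void convert(BufferedReader in, Writer out) throws IOException;
    }

    /** Converts four-line FASTQ records to FASTA by dropping the quality line. */
    static final class FastqToFasta implements FormatConverter {
        public String sourceFormat() { return "fastq"; }
        public String targetFormat() { return "fasta"; }

        public void convert(BufferedReader in, Writer out) throws IOException {
            String header;
            while ((header = in.readLine()) != null) {
                if (header.isEmpty()) continue; // tolerate blank lines between records
                String sequence = in.readLine();
                String separator = in.readLine();
                String quality = in.readLine();
                if (!header.startsWith("@") || sequence == null || separator == null
                        || !separator.startsWith("+") || quality == null) {
                    throw new IOException("Malformed FASTQ record near: " + header);
                }
                // FASTA keeps the read name and the bases only.
                out.write(">" + header.substring(1) + "\n" + sequence + "\n");
            }
        }
    }

    /** Registry keyed by "source->target", so new converters can be added without changing callers. */
    static final class ConverterRegistry {
        private final Map<String, FormatConverter> converters = new HashMap<>();

        void register(FormatConverter c) {
            converters.put(c.sourceFormat() + "->" + c.targetFormat(), c);
        }

        FormatConverter lookup(String source, String target) {
            FormatConverter c = converters.get(source + "->" + target);
            if (c == null) {
                throw new IllegalArgumentException("No converter for " + source + "->" + target);
            }
            return c;
        }
    }

    public static void main(String[] args) throws IOException {
        ConverterRegistry registry = new ConverterRegistry();
        registry.register(new FastqToFasta());

        String fastq = "@read1\nACGT\n+\nIIII\n";
        StringWriter out = new StringWriter();
        registry.lookup("fastq", "fasta")
                .convert(new BufferedReader(new StringReader(fastq)), out);
        System.out.print(out); // prints ">read1" and "ACGT" on two lines
    }
}
```

Under this design, supporting an additional format pair would mean writing one more `FormatConverter` implementation and registering it; streaming line by line through a `BufferedReader` keeps memory use independent of file size, which is one plausible way a converter could stay competitive on RAM usage.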

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Databases, Genetic*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Sequence Analysis, DNA / methods*
  • Software*
  • Systems Biology