Binpairs: utilization of Illumina paired-end information for improving efficiency of taxonomic binning of metagenomic sequences

PLoS One. 2014 Dec 31;9(12):e114814. doi: 10.1371/journal.pone.0114814. eCollection 2014.

Abstract

Motivation: Paired-end sequencing protocols, offered by next generation sequencing (NGS) platforms like Illumina, generate a pair of reads for every DNA fragment in a sample. Although this protocol has been used in several metagenomics studies, most taxonomic binning approaches classify the two reads of a pair independently. The present work explores some simple but effective strategies for using the pairing information of Illumina short reads to improve the accuracy of taxonomic binning of metagenomic datasets. The proposed strategies can be used in conjunction with all genres of existing binning methods.
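The abstract does not spell out the individual strategies, so the following is only an illustrative sketch of one plausible way to use pairing information, not the authors' implementation. It assumes each read of a pair has already been classified independently by some existing binning method, that an assignment is a root-to-leaf lineage (with None meaning unassigned), and that the pair is given a single consensus label. The function names `lowest_common_ancestor` and `combine_pair` are hypothetical.

```python
from typing import Optional, Tuple

# Assumption: a lineage is a root-to-leaf tuple of taxon names,
# and None marks a read that the binning method left unassigned.
Lineage = Optional[Tuple[str, ...]]


def lowest_common_ancestor(a: Tuple[str, ...], b: Tuple[str, ...]) -> Tuple[str, ...]:
    """Return the longest shared prefix of two root-to-leaf lineages."""
    common = []
    for x, y in zip(a, b):
        if x != y:
            break
        common.append(x)
    return tuple(common)


def combine_pair(first: Lineage, second: Lineage) -> Lineage:
    """Derive one assignment for a read pair from two independent assignments.

    Hypothetical rule: if only one read is assigned, the pair inherits that
    assignment; if both are assigned, the pair gets their lowest common
    ancestor; if they share no ancestor, the pair stays unassigned.
    """
    if first is None and second is None:
        return None
    if first is None:
        return second
    if second is None:
        return first
    lca = lowest_common_ancestor(first, second)
    return lca if lca else None


if __name__ == "__main__":
    r1 = ("Bacteria", "Proteobacteria", "Escherichia")
    r2 = ("Bacteria", "Proteobacteria", "Salmonella")
    print(combine_pair(r1, None))  # pair inherits the only available assignment
    print(combine_pair(r1, r2))    # pair is placed at the shared ancestor
```

Because the rule operates only on the per-read outputs, a sketch like this can sit on top of any binning method, which matches the abstract's claim that the strategies are independent of the underlying classifier.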

Results: Validation results suggest that employing these "Binpairs" strategies can significantly improve the binning outcome. The quality of the taxonomic assignments thus obtained is often comparable to that achievable only with the relatively longer reads obtained using other NGS platforms (such as Roche).

Availability: An implementation of the proposed strategies of utilizing pairing information is freely available for academic users at https://metagenomics.atc.tcs.com/binning/binpairs.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Classification / methods*
  • Computer Simulation
  • High-Throughput Nucleotide Sequencing / methods*
  • Metagenomics*
  • Reproducibility of Results
  • Sequence Alignment
  • Sequence Analysis, DNA / methods*
  • Statistics as Topic / methods*

Grants and funding

The authors have no support or funding to report.