Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness

BMC Bioinformatics. 2015;16(Suppl 18):S3. doi: 10.1186/1471-2105-16-S18-S3. Epub 2015 Dec 9.

Abstract

Background: Estimating the number of distinct species (richness) in a mixed microbial population has been a major focus of metagenomic research. Existing methods of species richness estimation rely on the assumption that the reads in each assembled contig correspond to only one of the microbial genomes in the population. This assumption and the underlying probabilistic formulations of existing methods are poorly suited to quasispecies populations, in which the strains are highly genetically related.

Results: On benchmark data sets, our estimation method provided accurate richness estimates (median estimation error < 0.2) and improved the precision of ViQuaS by 2%-13% and its F-score by 1%-9% without compromising recall. We also demonstrate that our estimation method can improve the precision and F-score of ShoRAH by 0%-7% and 0%-5%, respectively.
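To make the reported metrics concrete, the following is a minimal sketch of how precision, recall, and F-score could be computed for a reconstructed spectrum. It assumes that a reconstructed strain counts as correct only when its sequence exactly matches a true strain, which the abstract does not specify. The helper `keep_top_k` and the toy sequences are hypothetical: they illustrate one plausible way a richness estimate could raise precision (by discarding low-frequency strains beyond the estimated count) and are not taken from ViQuaS or ShoRAH.

```python
def spectrum_scores(reconstructed, true_strains):
    """Return (precision, recall, F-score) for a reconstructed strain set.

    Assumption: a reconstructed strain is a true positive only if its
    sequence exactly matches one of the true strains.
    """
    predicted = set(reconstructed)
    truth = set(true_strains)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


def keep_top_k(spectrum, k):
    """Hypothetical pruning step: keep the k most frequent strains.

    `spectrum` maps strain sequence -> estimated relative frequency;
    `k` stands in for an estimated richness.
    """
    ranked = sorted(spectrum.items(), key=lambda item: item[1], reverse=True)
    return [sequence for sequence, _ in ranked[:k]]


# Toy data (invented for illustration only).
true_strains = ["ACGT", "ACGA", "TCGT"]
spectrum = {"ACGT": 0.50, "ACGA": 0.30, "TCGT": 0.15, "ACGG": 0.03, "TTGT": 0.02}

unpruned = spectrum_scores(spectrum.keys(), true_strains)
pruned = spectrum_scores(keep_top_k(spectrum, k=3), true_strains)
```

In this toy case the two spurious low-frequency strains lower the precision of the unpruned spectrum, while pruning to the assumed richness of 3 removes them and leaves recall unchanged.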

Conclusions: The proposed probabilistic estimation method can be used to estimate the richness of viral populations that exhibit quasispecies behavior and to improve the accuracy of the quasispecies spectra reconstructed by the existing methods ViQuaS and ShoRAH in the presence of a moderate level of technical sequencing errors.

Availability: http://sourceforge.net/projects/viquas/.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Benchmarking
  • High-Throughput Nucleotide Sequencing
  • Internet
  • Metagenomics*
  • User-Computer Interface