mmquant: how to count multi-mapping reads?

BMC Bioinformatics. 2017 Sep 15;18(1):411. doi: 10.1186/s12859-017-1816-4.

Abstract

Background: RNA-Seq is now used routinely and provides accurate information on gene transcription. However, the method cannot accurately estimate the expression of duplicated genes. Several strategies have been used previously (dropping duplicated genes, distributing the reads uniformly, or estimating expression), but all of them provide biased results.

Results: We provide here a tool, called mmquant, for computing gene expression, including that of duplicated genes. If a read maps at different positions, the tool detects that the corresponding genes are duplicated; it merges the genes and creates a merged gene. The count of ambiguous reads is then based on the input genes and the merged genes.
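
The merged-gene idea can be illustrated with a minimal Python sketch. This is not mmquant's implementation: the function name, the input layout (a mapping from read name to the gene identifiers its alignments overlap), and the "--" separator used to name merged genes are assumptions made for illustration only.

```python
from collections import Counter


def count_with_merged_genes(read_hits):
    """Count reads per gene, creating a merged gene for ambiguous reads.

    read_hits: dict mapping a read name to the gene ids overlapped by
    that read's alignments (hypothetical input layout).
    """
    counts = Counter()
    for genes in read_hits.values():
        unique = sorted(set(genes))
        if not unique:
            continue  # the read overlaps no annotated gene
        # A read hitting several genes is counted once, under a merged
        # gene named after all of them (the "--" separator is assumed).
        counts["--".join(unique)] += 1
    return counts


read_hits = {
    "read1": ["geneA"],           # uniquely mapped
    "read2": ["geneA", "geneB"],  # multi-mapping: geneA and geneB
    "read3": ["geneB", "geneA"],  # same pair, different order
    "read4": ["geneC"],
    "read5": [],                  # unassigned
}

counts = count_with_merged_genes(read_hits)
for feature, n in sorted(counts.items()):
    print(feature, n)
```

In this toy input, the two ambiguous reads are neither dropped nor split between geneA and geneB; they are reported once each under a separate merged feature, alongside the counts of the input genes.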

Conclusion: mmquant is a drop-in replacement for the widely used tools htseq-count and featureCounts that handles multi-mapping reads in an unbiased way.

Keywords: Multi-mapping reads; Quantification; RNA-Seq.

MeSH terms

  • Computational Biology
  • Databases, Genetic
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation*
  • Sequence Analysis, RNA
  • Software*