GTax: improving de novo transcriptome assembly by removing foreign RNA contamination

Genome Biol. 2024 Jan 8;25(1):12. doi: 10.1186/s13059-023-03141-2.

Abstract

The cost and complexity of generating a complete reference genome mean that many organisms lack an annotated reference. An alternative is to use a de novo reference transcriptome. This approach is cost-effective but susceptible to off-target RNA contamination. In this manuscript, we present GTax, a taxonomy-structured database of genomic sequences that can be used with BLAST to detect and remove foreign contamination in RNA sequencing samples before assembly. In addition, we use a de novo transcriptome assembly of Solanum lycopersicum (tomato) to demonstrate that removing foreign contamination in sequencing samples reduces the number of assembled chimeric transcripts.
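As a rough illustration of the filtering idea described above, the following Python sketch flags reads whose best BLAST hit falls outside a set of accepted taxa and drops them before assembly. It is not the GTax implementation. The function names are hypothetical, and the input is assumed to be BLAST tabular output with three tab-separated columns (read ID, subject taxonomy IDs, bit score), such as might be produced with a custom `-outfmt "6 qseqid staxids bitscore"` layout. A real workflow would more likely test membership in a taxonomic lineage than in a flat set of IDs.

```python
import csv
import io


def best_hits(blast_tsv):
    """Map each read ID to (bitscore, taxid) for its highest-scoring hit.

    Assumed columns (tab-separated): read ID, taxonomy IDs, bit score.
    """
    best = {}
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row:
            continue
        read_id, taxids, bitscore = row[0], row[1], float(row[2])
        # A hit may list several taxonomy IDs separated by ';'.
        # Simplification: use the first one only.
        taxid = int(taxids.split(";")[0])
        if read_id not in best or bitscore > best[read_id][0]:
            best[read_id] = (bitscore, taxid)
    return best


def foreign_reads(blast_tsv, target_taxids):
    """Return the IDs of reads whose best hit is outside target_taxids."""
    hits = best_hits(blast_tsv)
    return {read_id for read_id, (_, taxid) in hits.items()
            if taxid not in target_taxids}


def drop_foreign(read_ids, foreign):
    """Keep reads not flagged as foreign; reads with no hit are retained."""
    return [read_id for read_id in read_ids if read_id not in foreign]


# Illustrative input: 4081 is the NCBI taxonomy ID for Solanum lycopersicum,
# 562 for Escherichia coli, 9606 for Homo sapiens. The hits are made up.
example_hits = "r1\t4081\t200\nr1\t562\t50\nr2\t562\t180\nr3\t9606\t90\n"
flagged = foreign_reads(example_hits, {4081})
kept = drop_foreign(["r1", "r2", "r3", "r4"], flagged)
```

In this sketch a read with no BLAST hit (`r4`) is kept, which is one possible policy; a stricter filter could discard unclassified reads instead.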

MeSH terms

  • Databases, Factual
  • Genomics
  • RNA
  • Sequence Analysis, RNA
  • Solanum lycopersicum* / genetics
  • Transcriptome*

Substances

  • RNA