Combination of de novo assembly of massive sequencing reads with classical repeat prediction improves identification of repetitive sequences in Schistosoma mansoni

Exp Parasitol. 2012 Apr;130(4):470-4. doi: 10.1016/j.exppara.2012.02.010. Epub 2012 Feb 21.

Abstract

The genome of the parasitic platyhelminth Schistosoma mansoni is composed of approximately 40% repetitive sequences, of which roughly 20% correspond to transposable elements. When the genome sequence became available, conventional repeat prediction programs were used to find these repeats, but only a fraction could be identified. To characterize the repeats exhaustively, we applied a new strategy based on massive sequencing: we re-sequenced the genome by next-generation sequencing, aligned the sequencing reads to the genome, and assembled all multiple-hit reads into contigs corresponding to the repetitive part of the genome. We present here, for the first time, this de novo repeat assembly strategy, and we confirm that such assembly is feasible. We identified and annotated 4,143 new repeats in the S. mansoni genome. At least one third of the repeats are transcribed. This strategy also allowed us to identify 14 new microsatellite markers, which can be used for pedigree studies. Annotations and the combined (previously known and new) 5,420 repeat sequences (corresponding to 47% of the genome) are available for download (http://methdb.univ-perp.fr/downloads/).
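
The read-selection step of the strategy described above (keep the reads that align to more than one place in the genome, then pass them to an assembler) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `select_multi_hit_reads`, the `(read_id, scaffold, position)` tuple layout, the `min_hits` threshold, and the toy data are all assumptions made for illustration, and the abstract does not say which aligner or assembler was used.

```python
from collections import Counter


def select_multi_hit_reads(alignments, min_hits=2):
    """Return the IDs of reads that align to at least `min_hits` distinct locations.

    `alignments` is an iterable of (read_id, scaffold, position) tuples.
    This layout is a hypothetical, simplified stand-in for parsed aligner output.
    """
    # Deduplicate identical records so a repeated line is not counted twice.
    locations = set(alignments)
    # Count distinct alignment locations per read.
    hits = Counter(read_id for read_id, _scaffold, _position in locations)
    return sorted(read_id for read_id, n in hits.items() if n >= min_hits)


# Toy alignment records (invented for illustration only).
example = [
    ("read1", "scaffold_1", 100),  # unique hit: not a repeat candidate
    ("read2", "scaffold_1", 500),
    ("read2", "scaffold_7", 42),   # second location for read2
    ("read3", "scaffold_3", 10),
    ("read3", "scaffold_3", 900),
    ("read3", "scaffold_9", 77),   # third location for read3
]

repeat_candidates = select_multi_hit_reads(example)
print(repeat_candidates)  # ['read2', 'read3']
```

In a real workflow the selected read IDs would be used to extract the corresponding sequences, which would then be handed to a de novo assembler to build contigs for the repetitive fraction of the genome.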

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Biomphalaria
  • DNA, Complementary / chemistry
  • DNA, Helminth / chemistry
  • DNA, Ribosomal / chemistry
  • RNA, Helminth / genetics
  • RNA, Helminth / isolation & purification
  • RNA, Ribosomal, 28S / genetics
  • Repetitive Sequences, Nucleic Acid / physiology*
  • Schistosoma mansoni / genetics*
  • Sequence Alignment / methods
  • Sequence Analysis / methods
  • Transcription, Genetic / physiology

Substances

  • DNA, Complementary
  • DNA, Helminth
  • DNA, Ribosomal
  • RNA, Helminth
  • RNA, Ribosomal, 28S