Whole-genome sequence of the oriental lung fluke Paragonimus westermani

Gigascience. 2019 Jan 1;8(1):giy146. doi: 10.1093/gigascience/giy146.

Abstract

Background: Foodborne infections caused by lung flukes of the genus Paragonimus are a significant and widespread public health problem in tropical areas. Approximately 50 Paragonimus species have been reported to infect animals and humans, but Paragonimus westermani is responsible for the bulk of human disease. Despite their medical and economic importance, no genome sequence for any Paragonimus species is available.

Results: We sequenced and assembled the genome of P. westermani, which is among the largest of the known pathogen genomes with an estimated size of 1.1 Gb. A 922.8 Mb genome assembly was generated from Illumina and Pacific Biosciences (PacBio) sequence data, covering 84% of the estimated genome size. The genome has a high proportion (45%) of repeat-derived DNA, particularly of the long interspersed element and long terminal repeat subtypes, and the expansion of these elements may partly explain the large genome size. We predicted 12,852 protein-coding genes, showing a high level of conservation with related trematode species. The majority of proteins (80%) had homologs in the human liver fluke Opisthorchis viverrini, with an average sequence identity of 64.1%. Assembly of the P. westermani mitochondrial genome from long PacBio reads resulted in a single high-quality circularized 20.6 kb contig. The contig harbored a 6.9 kb region of non-coding repetitive DNA composed of three distinct repeat units. Our results suggest that the region is highly polymorphic in P. westermani, possibly even within single worm isolates.

Conclusions: The generated assembly represents the first Paragonimus genome sequence and will facilitate future molecular studies of this important, but neglected, parasite group.
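
The size figures reported in the Results can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not the authors' method: the variable names are hypothetical, and it assumes 1 Gb = 1000 Mb. It recomputes the fraction of the estimated genome covered by the assembly and the share of the mitochondrial contig taken up by the repetitive region.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# Values taken from the abstract; names are illustrative.
assembly_mb = 922.8
estimated_genome_mb = 1100.0  # 1.1 Gb, assuming 1 Gb = 1000 Mb

mito_contig_kb = 20.6
mito_repeat_kb = 6.9

# Fraction of the estimated genome represented in the assembly.
assembly_coverage = percent(assembly_mb, estimated_genome_mb)

# Share of the mitochondrial contig occupied by the repeat region.
mito_repeat_share = percent(mito_repeat_kb, mito_contig_kb)

print(f"Assembly coverage: {assembly_coverage:.1f}%")
print(f"Mitochondrial repeat share: {mito_repeat_share:.1f}%")
```

The first value rounds to the 84% stated in the abstract. The second, roughly one third of the contig, is a derived figure that the abstract does not state directly.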

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Genome Size
  • Genome, Helminth*
  • Genome, Mitochondrial
  • High-Throughput Nucleotide Sequencing
  • Molecular Sequence Annotation
  • Paragonimus westermani / genetics*
  • Phylogeny
  • Sequence Homology, Nucleic Acid
  • Whole Genome Sequencing / methods*