Deep sequencing of the transcriptomes of soybean aphid and associated endosymbionts

PLoS One. 2012;7(9):e45161. doi: 10.1371/journal.pone.0045161. Epub 2012 Sep 12.

Abstract

Background: The soybean aphid has significantly impacted soybean production in the U.S. Transcriptomic analyses were conducted to identify leads for potential novel management strategies.

Methodology/principal findings: Transcriptomic data were generated from whole aphids and from 2,000 aphid guts using an Illumina GAII sequencer. The sequence data were assembled de novo using the Velvet assembler. In addition to providing a general overview, we demonstrate (i) the use of the Multiple-k/Multiple-C method for de novo assembly of short read sequences, followed by BLAST annotation of contigs for increased transcript identification: from 400,000 contigs analyzed, 16,257 non-redundant BLAST hits were identified; (ii) analysis of species distributions of top non-redundant hits: 80% of BLAST hits (e-value cutoff of 1.0E-3) were to the pea aphid or other aphid species, representing about half of the pea aphid genes; (iii) comparison of relative depth of sequence coverage to relative transcript abundance for genes with high (membrane alanyl aminopeptidase N) or low transcript abundance; (iv) analysis of the Buchnera transcriptome: transcripts from 57.6% of the genes from Buchnera aphidicola were identified; (v) identification of Arsenophonus and Wolbachia as potential secondary endosymbionts; (vi) alignment of full-length sequences from RNA-seq data for the putative salivary gland protein C002, the silencing of which has potential for aphid management, and the putative Bacillus thuringiensis Cry toxin receptors, aminopeptidase N and alkaline phosphatase.
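
The hit-filtering and species-tally steps in (i) and (ii) can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' pipeline: it assumes BLAST results in the standard 12-column tabular layout (query, subject, ..., e-value in column 11, bit score in column 12), applies the 1.0E-3 e-value threshold quoted above, keeps the best hit per contig, and collapses contigs that hit the same subject. The function names, the example rows, and the subject-to-species lookup are all hypothetical.

```python
import csv
import io
from collections import Counter

E_VALUE_CUTOFF = 1e-3  # threshold quoted in the abstract


def top_hits(blast_tsv, cutoff=E_VALUE_CUTOFF):
    """Map each contig to its best subject among hits with e-value <= cutoff."""
    best = {}  # contig id -> (evalue, -bitscore, subject id)
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row:
            continue
        contig, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > cutoff:
            continue  # hit is too weak to keep
        key = (evalue, -bitscore, subject)  # lower e-value, then higher score
        if contig not in best or key < best[contig]:
            best[contig] = key
    return {contig: key[2] for contig, key in best.items()}


def non_redundant_subjects(hits):
    """Collapse contigs that share a top hit into one entry per subject."""
    return set(hits.values())


def species_distribution(subjects, species_of):
    """Count subjects per species; species_of is a hypothetical lookup table."""
    return Counter(species_of.get(s, "unknown") for s in subjects)


# Made-up example rows in 12-column tabular layout (not data from the study).
EXAMPLE = "\n".join([
    "contig_1\tsubj_A\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-50\t300",
    "contig_1\tsubj_B\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-10\t90",
    "contig_2\tsubj_A\t95.0\t150\t7\t0\t1\t150\t1\t150\t1e-30\t200",
    "contig_3\tsubj_C\t70.0\t50\t15\t0\t1\t50\t1\t50\t0.5\t30",
    "contig_4\tsubj_D\t90.0\t120\t12\t0\t1\t120\t1\t120\t1e-20\t150",
])
SPECIES = {
    "subj_A": "Acyrthosiphon pisum",
    "subj_D": "Buchnera aphidicola",
}

if __name__ == "__main__":
    hits = top_hits(EXAMPLE)
    subjects = non_redundant_subjects(hits)
    print(species_distribution(subjects, SPECIES))
```

In this toy input, contig_1 and contig_2 share the same top subject and so count once, which is the sense in which a large contig set can reduce to a much smaller set of non-redundant hits.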

Conclusions/significance: This study provides the most comprehensive data set to date for soybean aphid gene expression. This work also illustrates the utility of short-read transcriptome sequencing and the Multiple-k/Multiple-C method followed by BLAST annotation for rapid identification of target genes for organisms for which reference genome sequences are not available, and extends the utility to include the transcriptomes of endosymbionts.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Aphids / genetics*
  • Aphids / microbiology
  • Aphids / physiology
  • Base Sequence
  • Buchnera / genetics
  • Buchnera / physiology
  • Digestive System / metabolism
  • Digestive System / microbiology
  • Enterobacteriaceae / genetics
  • Enterobacteriaceae / physiology
  • Genes, Bacterial / genetics
  • Genes, Insect / genetics
  • Glycine max / parasitology
  • High-Throughput Nucleotide Sequencing / methods*
  • Host-Pathogen Interactions
  • Insect Proteins / classification
  • Insect Proteins / genetics
  • Molecular Sequence Data
  • Phylogeny
  • Pisum sativum / parasitology
  • Reverse Transcriptase Polymerase Chain Reaction
  • Sequence Homology, Amino Acid
  • Sequence Homology, Nucleic Acid
  • Transcriptome*
  • Wolbachia / genetics
  • Wolbachia / physiology

Substances

  • Insect Proteins

Associated data

  • GENBANK/JN135238
  • GENBANK/JN135239
  • GENBANK/JN135240
  • GENBANK/JN135241
  • GENBANK/JN135242
  • GENBANK/JN135243
  • GENBANK/JN135244
  • GENBANK/JN135245

Grants and funding

Funding for this project was provided by the Iowa State University Center for Integrated Animal Genomics (http://www.ciag.iastate.edu/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.