CONTRAILS: A tool for rapid identification of transgene integration sites in complex, repetitive genomes using low-coverage paired-end sequencing

Genom Data. 2015 Sep 8:6:175-81. doi: 10.1016/j.gdata.2015.09.001. eCollection 2015 Dec.

Abstract

Transgenic crops have become a staple of modern agriculture and are typically characterized using a variety of molecular techniques involving proteomics and metabolomics. Characterization of the transgene insertion site is of great interest, as disruptions, deletions, and genomic location can affect product selection and fitness, and regulatory agencies require identification of these regions and confirmation of their integrity. Here, we present CONTRAILS (Characterization of Transgene Insertion Locations with Sequencing), a straightforward, rapid, and reproducible method for identifying transgene insertion sites in highly complex and repetitive genomes using low-coverage paired-end Illumina sequencing and traditional PCR. This pipeline requires little to no troubleshooting and is not restricted to any genome type, so it can be used for many molecular applications. Using whole-genome sequencing of in-house transgenic Glycine max, a legume with a highly repetitive and complex genome, we used CONTRAILS to identify the location of a single T-DNA insertion at single-base resolution.
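
The abstract does not spell out the pipeline's internals, so the following is only an illustrative sketch of one general way paired-end data can point to an insertion site, not the CONTRAILS implementation. It assumes read pairs have already been aligned to a combined reference (host genome plus transgene construct) and that each pair is summarized as a tuple; the record layout, the reference name `T-DNA`, the function name, and the `window` and `min_pairs` parameters are all hypothetical. Pairs with one mate on the construct and the other on a host chromosome are treated as junction-spanning, and their host-side positions are clustered into candidate loci.

```python
from collections import defaultdict

# Hypothetical record layout (an assumption, not from the paper):
# (pair_id, mate1_ref, mate1_pos, mate2_ref, mate2_pos)

def candidate_insertion_sites(pairs, transgene_ref="T-DNA", window=500, min_pairs=2):
    """Cluster host-genome positions of pairs whose other mate maps to the transgene."""
    anchors = defaultdict(list)
    for _pair_id, ref1, pos1, ref2, pos2 in pairs:
        on_tg1 = ref1 == transgene_ref
        on_tg2 = ref2 == transgene_ref
        if on_tg1 == on_tg2:
            continue  # both mates on the host or both on the transgene: uninformative
        # Keep the coordinate of the mate that landed on the host genome.
        chrom, pos = (ref2, pos2) if on_tg1 else (ref1, pos1)
        anchors[chrom].append(pos)

    sites = []
    for chrom, positions in anchors.items():
        positions.sort()
        cluster = [positions[0]]
        for pos in positions[1:]:
            if pos - cluster[-1] <= window:
                cluster.append(pos)  # close enough to extend the current cluster
            else:
                if len(cluster) >= min_pairs:
                    sites.append((chrom, cluster[0], cluster[-1], len(cluster)))
                cluster = [pos]
        if len(cluster) >= min_pairs:
            sites.append((chrom, cluster[0], cluster[-1], len(cluster)))

    # Best-supported candidate first.
    return sorted(sites, key=lambda s: -s[3])


# Made-up toy input, for illustration only.
example_pairs = [
    ("r1", "chrA", 1000, "T-DNA", 50),
    ("r2", "T-DNA", 8000, "chrA", 1200),
    ("r3", "chrA", 1100, "T-DNA", 120),
    ("r4", "chrB", 500, "chrB", 800),     # ordinary host pair, ignored
    ("r5", "T-DNA", 10, "T-DNA", 300),    # internal transgene pair, ignored
    ("r6", "chrC", 99999, "T-DNA", 40),   # lone pair, below min_pairs
]
sites = candidate_insertion_sites(example_pairs)
```

A cluster of this kind only brackets a candidate locus. Narrowing it to a single base would still require examining the junction itself, for example with the PCR step the abstract mentions.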

Keywords: Agrobacterium; FISH, Fluorescent In-situ Hybridization; IGB, Integrated Genome Browser; Insertion; Junction sequences; NGS, Next-Generation Sequencing; T-DNA, Transfer DNA; Transformation; hTG, human thyroglobulin.