Analysis of the transgene insertion pattern in a transgenic mouse strain using long-read sequencing

Exp Anim. 2020 Aug 5;69(3):279-286. doi: 10.1538/expanim.19-0118. Epub 2020 Feb 11.

Abstract

Transgene insertion patterns are critical for the analysis of transgenic animals because the effects of a transgene may vary with the insertion pattern (such as the copy number and the orientation of concatenated copies) and with the insertion position in the genome. We previously reported a genomic walking strategy to locate transgenes in the genomes of transgenic mice (Exp. Anim. 53: 103-111, 2004) and to analyze transgene insertion patterns (Exp. Anim. 55: 65-69, 2006). With these strategies, however, we could not determine the transgene copy number or the global genome modifications induced by transgene insertion because of read-length limitations. In this study, we used a long-read sequencer (MinION, Oxford Nanopore Technologies) to overcome this limitation. We obtained 922,210 reads using MinION with genomic DNA from a transgenic mouse strain (4C30, Proc. Jpn. Acad. Ser. B. Phys. Biol. Sci. 87: 550-562, 2011). Using a local BLAST search, we found one 21,457-bp read among them that contained the transgene. Nucleotide dot plot analysis revealed that the transgene was inserted in the genome as a tandem concatemer consisting of a nearly complete copy of the construct (positions 15-3,508 of 3,508 bp) and a partial fragment (positions 4-660; 657 bp). A BLAST search in Ensembl against the C57BL/6N genome revealed a 9,388-bp deletion at the insertion position in an intron of the Sgcd gene, confirming that mutations such as large genomic deletions can occur at the time of transgene insertion. Thus, long-read sequencers are useful tools for the analysis of transgene insertion patterns.
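The dot plot step in the abstract can be illustrated with a minimal sketch. It is not the authors' tool or pipeline: the function names, the exact k-mer matching, and the default word size `k=12` are all assumptions made for illustration. It lists every position where a word of length k in the read matches the construct on either strand. When these hits are plotted, a tandem concatemer appears as parallel diagonals, and an inverted copy appears as an anti-diagonal.

```python
from collections import defaultdict

# Translation table for complementing DNA bases (illustrative helper).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_index(seq, k):
    """Map every k-mer in seq to the list of its 0-based start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def dot_plot_matches(construct, read, k=12):
    """Return (read_pos, construct_pos, strand) for each shared k-mer.

    Hypothetical helper: exact matching only, so it assumes the read is
    accurate enough for k-mers to survive sequencing errors.
    """
    n = len(construct)
    forward = kmer_index(construct, k)
    reverse = kmer_index(reverse_complement(construct), k)
    hits = []
    for j in range(len(read) - k + 1):
        word = read[j:j + k]
        for i in forward.get(word, ()):
            hits.append((j, i, "+"))
        for i in reverse.get(word, ()):
            # Convert the position on the reverse-complemented construct
            # back to the start of the same k-mer on the forward strand.
            hits.append((j, n - k - i, "-"))
    return hits
```

Calling `dot_plot_matches(construct, read)` on a read that carries one full copy of the construct followed by a partial copy should give two runs of "+" hits whose construct coordinates both start near zero. That is the pattern described above for the 21,457-bp read. Raw nanopore reads contain errors, so in practice a shorter word size or an alignment-based dot plot tool is likely to be more robust than exact matching.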

Keywords: long-read sequencer; mice; nanopore; transgene insertion pattern.

MeSH terms

  • Animals
  • Genome / genetics
  • Mice, Inbred C57BL
  • Mice, Transgenic
  • Mutagenesis, Insertional*
  • Mutation
  • Sarcoglycans / genetics
  • Sequence Analysis, DNA / methods*
  • Transgenes / genetics*

Substances

  • Sarcoglycans
  • Sgcd protein, mouse