An automated algorithm for extracting functional immunologic V-genes from genomes in jawed vertebrates

Immunogenetics. 2013 Sep;65(9):691-702. doi: 10.1007/s00251-013-0715-8. Epub 2013 Jun 22.

Abstract

Variable (V) domains of immunoglobulins (Ig) and T cell receptors (TCR) are generated from genomic V gene segments (V-genes). At present, such V-genes have been annotated in the genomes of only a few species. We have developed a bioinformatics tool that accelerates the task of identifying functional V-genes from genome datasets. Automated recognition is accomplished by detecting key V-gene signatures, such as recombination signal sequences, the size of the exon region, and the position of amino acid motifs within the translated exon. The algorithm also classifies extracted V-genes as belonging to either TCR or Ig loci. We describe the implementation of the algorithm and validate its accuracy by comparing V-genes identified from the human and mouse genomes with known V-gene annotations available in public repositories. The advantages and utility of the algorithm are illustrated by using it to identify functional V-genes in the rat genome, where V-gene annotation is still incomplete. This allowed us to perform a comparative human-rodent phylogenetic analysis based on V-genes, which supports the hypothesis that distinct evolutionary pressures shape the TCR and Ig V-gene repertoires. Our program, together with a graphical user interface, is available as open-source software, downloadable at http://code.google.com/p/vgenextract/.
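The three signatures named in the abstract (a recombination signal sequence, an exon of plausible size, and conserved residues in the translated exon) can be combined into a simple filter. The Python sketch below is illustrative only and is not the published implementation: it assumes an exact match to the consensus RSS (heptamer CACAGTG, a 12 or 23 bp spacer, nonamer ACAAAAACC), a fixed 300 nt upstream window, and a hypothetical Cys...Trp...Cys spacing pattern. The function name candidate_v_exons and all thresholds are assumptions chosen for the example, and only the forward strand is scanned.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Consensus RSS: heptamer, 12 or 23 bp spacer, nonamer (exact match assumed).
RSS = re.compile(r"CACAGTG(?:[ACGT]{12}|[ACGT]{23})ACAAAAACC")

# Hypothetical motif spacing: Cys, then Trp, then a second Cys.
MOTIF = re.compile(r"C.{8,20}W.{40,80}C")


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def candidate_v_exons(genome, exon_len=300):
    """Return (start, end, protein) for windows that pass all three checks.

    exon_len is an assumed fixed window size, not a value from the paper.
    """
    hits = []
    for match in RSS.finditer(genome):
        end = match.start()          # exon is taken to end where the RSS begins
        start = end - exon_len
        if start < 0:
            continue
        window = genome[start:end]
        for frame in range(3):
            protein = translate(window[frame:])
            if "*" in protein:       # an in-frame stop suggests a non-functional gene
                continue
            if MOTIF.search(protein):
                hits.append((start + frame, end, protein))
    return hits
```

Calling candidate_v_exons on an uppercase genomic sequence returns a list of candidate windows. A practical tool would also need to tolerate mismatches in the RSS, scan the reverse strand, and handle variable exon lengths.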

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Animals
  • Computational Biology
  • Genes, Immunoglobulin
  • Genetic Variation
  • Genome
  • Humans
  • Immunoglobulin Variable Region / genetics*
  • Mice
  • Phylogeny
  • Rats
  • Receptors, Antigen, T-Cell / genetics*
  • Software

Substances

  • Immunoglobulin Variable Region
  • Receptors, Antigen, T-Cell