Identification of Pseudogenes in Brachypodium distachyon Chromosomes

Methods Mol Biol. 2018:1667:149-171. doi: 10.1007/978-1-4939-7278-4_12.

Abstract

Pseudogenes are gene copies that have lost the capability to encode a functional protein. Based on their structure, pseudogenes are classified into two types. Processed pseudogenes arise by retrotranscription of a spliced mRNA and subsequent integration into the genome. Nonprocessed (or duplicated) pseudogenes are generated by genomic duplication followed by mutations that disable them, so that they can no longer encode a functional protein. Unlike processed pseudogenes, duplicated pseudogenes are expected to conserve the exon-intron structure of their functional paralogs. Here, we describe a computational pipeline for identifying pseudogenes of both types in B. distachyon chromosomes. Our pipeline (1) identifies pseudogenes based on tBLASTn searches of B. distachyon proteins against the noncoding genomic complement of the same species, (2) identifies the most homologous functional paralog of each pseudogene as its parental locus, and (3) uses the intron-exon structure of parental genes to distinguish between pseudogene types. The pipeline is presented step by step and tested on the Brachypodium distachyon Bd1 scaffold.
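Steps (2) and (3) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the pipeline's actual criteria, so the `Hsp` record, the function names `pick_parent` and `classify`, the summed-bit-score rule for choosing the parent gene, and the `min_intron` and `flank` thresholds are all assumptions made for illustration. The sketch also assumes plus-strand hits that have already been parsed from tBLASTn output, and that the parent gene's exon junctions are known in protein coordinates.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hsp:
    """One tBLASTn high-scoring pair (hypothetical record): protein query vs. genome."""
    protein_id: str
    q_start: int    # 1-based start in the protein (amino acids)
    q_end: int      # 1-based end in the protein
    s_start: int    # 1-based genomic start (plus strand assumed)
    s_end: int      # 1-based genomic end
    bitscore: float


def pick_parent(hsps: List[Hsp]) -> str:
    """Assumed rule: the protein with the highest summed bit score is the parent."""
    totals: Dict[str, float] = {}
    for h in hsps:
        totals[h.protein_id] = totals.get(h.protein_id, 0.0) + h.bitscore
    return max(totals, key=lambda pid: totals[pid])


def classify(hsps: List[Hsp], junctions: List[int],
             min_intron: int = 30, flank: int = 5) -> str:
    """Classify one pseudogene from the HSPs of its parent protein.

    junctions: protein positions after which an intron interrupts the
    parent's coding sequence. Thresholds are illustrative assumptions.
    """
    hsps = sorted(hsps, key=lambda h: h.q_start)

    # Processed signature: one HSP aligns continuously across an exon
    # junction, so the intron is absent from the genomic copy.
    for h in hsps:
        for j in junctions:
            if h.q_start <= j - flank and h.q_end >= j + flank:
                return "processed"

    # Duplicated signature: consecutive HSPs break near a junction and are
    # separated by an intron-sized genomic gap.
    for a, b in zip(hsps, hsps[1:]):
        gap = b.s_start - a.s_end - 1
        near_junction = any(a.q_end - flank <= j <= b.q_start + flank
                            for j in junctions)
        if gap >= min_intron and near_junction:
            return "duplicated"

    # Single-exon parents and partial hits give no structural evidence.
    return "undetermined"


# Toy data: the parent protein P1 has one intron after residue 50.
duplicated_hits = [Hsp("P1", 1, 50, 1000, 1149, 90.0),
                   Hsp("P1", 51, 120, 1400, 1609, 120.0)]
processed_hits = [Hsp("P1", 1, 120, 5000, 5359, 200.0)]

dup_label = classify(duplicated_hits, junctions=[50])
proc_label = classify(processed_hits, junctions=[50])
parent = pick_parent(duplicated_hits + [Hsp("P2", 1, 100, 1000, 1299, 150.0)])
```

A parent gene without introns always yields "undetermined" in this sketch, because the intron-exon structure then carries no information for separating the two pseudogene types.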

Keywords: Disabling mutations; Duplicated pseudogenes; Intron–exon structure; Paralogous genes; Processed pseudogenes.

MeSH terms

  • Brachypodium / genetics*
  • Chromosomes, Plant / genetics*
  • Exons
  • Genome, Plant
  • Genomics / methods*
  • Introns
  • Plant Proteins / genetics
  • Pseudogenes*
  • Software

Substances

  • Plant Proteins