Fake IDs? Widespread misannotation of DNA transposons as a general transcription factor

Genome Biol. 2023 Nov 13;24(1):260. doi: 10.1186/s13059-023-03102-9.

Abstract

Accurate annotation of genes and transposable elements (TEs) is vital for understanding genomes, but current annotation pipelines often misannotate TEs as genes. This study shows how DNA transposons in non-mammalian species have been erroneously annotated as the general transcription factor II-I repeat domain-containing protein 2 (GTF2IRD2), because GTF2IRD2 contains a 3' fused hAT transposase domain. We also demonstrate the generality of this problem by identifying TEs misannotated as genes in other vertebrate genomes. Such misannotations can lead to errors in phylogenetic analyses and waste investigators' time. We propose adding a final TE check to gene annotation pipelines to mitigate this problem.

Keywords: Annotation; DNA transposon; GTF2; Genome; Transcription factor; Transposable element.

MeSH terms

  • Animals
  • DNA Transposable Elements*
  • Molecular Sequence Annotation
  • Phylogeny
  • Transcription Factors, General*
  • Vertebrates / genetics

Substances

  • DNA Transposable Elements
  • Transcription Factors, General