Updated understanding of the protein-DNA recognition code used by C2H2 zinc finger proteins

Curr Opin Struct Biol. 2024 May 15:87:102836. doi: 10.1016/j.sbi.2024.102836. Online ahead of print.

Abstract

C2H2 zinc-finger (ZF) proteins form the largest family of DNA-binding transcription factors coded by mammalian genomes. In a typical DNA-binding ZF module, there are twelve residues (numbered from -1 to -12) between the last zinc-coordinating cysteine and the first zinc-coordinating histidine. The established C2H2-ZF "recognition code" suggests that residues at positions -1, -4, and -7 recognize the 5', central, and 3' bases of a DNA base-pair triplet, respectively. Structural studies have highlighted that additional residues at positions -5 and -8 also play roles in specific DNA recognition. The presence of bulky and either charged or polar residues at these five positions determines specificity for given DNA bases: guanine is recognized by arginine, lysine, or histidine; adenine by asparagine or glutamine; thymine or 5-methylcytosine by glutamate; and unmodified cytosine by aspartate. This review discusses recent structural characterizations of C2H2-ZFs that add to our understanding of the principles underlying the C2H2-ZF recognition code.
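The residue-to-base pairings listed in the abstract can be read as a simple lookup table. The following is a minimal sketch in Python, not a method from the review: the names `RESIDUE_TO_BASES`, `POSITION_TO_TRIPLET_SLOT`, and `predict_triplet` are hypothetical, and the table covers only the pairings and the three canonical positions (-1, -4, -7) exactly as stated above. Positions -5 and -8 and any context-dependent effects are deliberately left out, so the output should be treated as a rough first guess rather than a prediction of real binding specificity.

```python
# Residue-to-base preferences as listed in the abstract (one-letter amino acid codes).
# "mC" stands for 5-methylcytosine. This is an illustrative simplification.
RESIDUE_TO_BASES = {
    "R": ("G",),        # arginine  -> guanine
    "K": ("G",),        # lysine    -> guanine
    "H": ("G",),        # histidine -> guanine
    "N": ("A",),        # asparagine -> adenine
    "Q": ("A",),        # glutamine  -> adenine
    "E": ("T", "mC"),   # glutamate  -> thymine or 5-methylcytosine
    "D": ("C",),        # aspartate  -> unmodified cytosine
}

# Canonical positions and the triplet base each is said to recognize,
# following the assignment given in the abstract.
POSITION_TO_TRIPLET_SLOT = {-1: "5'", -4: "central", -7: "3'"}


def predict_triplet(residues_by_position):
    """Map residues at positions -1, -4, -7 to candidate bases per triplet slot.

    residues_by_position: dict of position (int) -> one-letter residue code.
    Returns a dict of slot name -> tuple of candidate bases; an empty tuple
    means the residue is not covered by the table (no call is made).
    """
    prediction = {}
    for position, slot in POSITION_TO_TRIPLET_SLOT.items():
        residue = residues_by_position.get(position)
        prediction[slot] = RESIDUE_TO_BASES.get(residue, ())
    return prediction
```

For example, `predict_triplet({-1: "R", -4: "E", -7: "D"})` looks up arginine, glutamate, and aspartate and returns the candidate bases for the 5', central, and 3' slots. A residue outside the table, such as alanine, yields an empty tuple instead of a guess.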

Keywords: C2H2 zinc fingers; DNA sequence-specific recognition; protein-DNA interactions; transcription factors.

Publication types

  • Review