Shrinking of repeating unit length in leucine-rich repeats from double-stranded DNA viruses

Arch Virol. 2021 Jan;166(1):43-64. doi: 10.1007/s00705-020-04820-2. Epub 2020 Oct 14.

Abstract

Leucine-rich repeats (LRRs) are present in over 563,000 proteins from viruses to eukaryotes. LRRs repeat in tandem and have been classified into fifteen classes in which the repeat unit lengths range from 20 to 29 residues. Most LRR proteins are involved in protein-protein or ligand interactions. The amount of genome sequence data from viruses is increasing rapidly, and although viral LRR proteins have been identified, a comprehensive sequence analysis has not yet been done, and their structures, functions, and evolution are still unknown. In the present study, we characterized viral LRRs by sequence analysis and identified over 600 LRR proteins from 89 virus species. Most of these proteins were from double-stranded DNA (dsDNA) viruses, including nucleocytoplasmic large dsDNA viruses (NCLDVs). We found that the repeating unit lengths of 11 types are one to five residues shorter than those of the seven known corresponding LRR classes. The repeating units of six types are 19 residues long and are thus the shortest among all LRRs. In addition, two of the LRR types are unique and have not been observed in bacteria, archaea, or eukaryotes. Conserved strongly hydrophobic residues such as Leu, Val, or Ile in the consensus sequences are replaced by Cys with high frequency. Phylogenetic analysis indicated that horizontal gene transfer of some viral LRR genes had occurred between the virus and its host. We suggest that the shortening might contribute to the survival strategy of viruses. The present findings provide a new perspective on the origin and evolution of LRRs.

MeSH terms

  • Archaea / virology
  • Bacteria / virology
  • Consensus Sequence / genetics
  • DNA / genetics*
  • Eukaryota / virology
  • Leucine / genetics*
  • Phylogeny
  • Repetitive Sequences, Amino Acid / genetics*
  • Viral Proteins / genetics
  • Viruses / genetics*

Substances

  • Viral Proteins
  • DNA
  • Leucine