Genomic and geographical structure of human cytomegalovirus

Proc Natl Acad Sci U S A. 2023 Jul 25;120(30):e2221797120. doi: 10.1073/pnas.2221797120. Epub 2023 Jul 17.

Abstract

Human cytomegalovirus (CMV) has infected humans since the origin of our species and currently infects most of the world's population. Variability between CMV genomes is the highest of any human herpesvirus, yet large portions of the genome are conserved. Here, we show that the genome encodes 74 regions of relatively high variability, each with 2 to 8 alleles. We then identified two patterns in the CMV genome. Conserved parts of the genome and a minority (32) of the variable regions show geographic population structure, with evidence for African or European clustering, although hybrid strains are present. We find no evidence that geographic segregation has been driven by host immune pressure affecting known antigenic sites. The remaining 42 variable regions show no geographical structure, with similar allele distributions across different continental populations. These "nongeographical" regions are significantly enriched for genes encoding immunomodulatory functions, suggesting a core functional importance. We hypothesize that at least two CMV founder populations account for the geographical differences that are largely seen in the conserved portions of the genome, although the timing of separation and the direction of spread between the two are not clear. In contrast, the similar allele frequencies among the 42 nongeographical variable regions, irrespective of geographical origin, are indicative of a second evolutionary process, namely balancing selection, which may preserve properties critical to CMV biological function. Given that genetic differences between CMVs are postulated to alter immunogenicity and potentially function, understanding these two evolutionary processes could contribute important information for the development of globally effective vaccines and the identification of novel drug targets.
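As a rough illustration of what "geographical structure" in a variable region could mean in practice, the sketch below measures how strongly allele identity is associated with continent of origin in a table of allele counts. It is not the authors' method (the abstract does not describe the analysis, and the keywords point to hidden Markov models); it is a plain Pearson chi-square statistic and Cramér's V, and the counts, variable names, and function names are all hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of allele counts.

    Rows are alleles, columns are populations. Assumes every row and
    column total is positive, so no expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if allele and population were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(table):
    """Cramér's V: 0 means no association, 1 means complete association."""
    total = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return (chi_square_statistic(table) / (total * k)) ** 0.5


# Hypothetical counts, NOT data from the paper.
# Rows: two alleles of one variable region. Columns: (Africa, Europe).
geographic_region = [[40, 5], [6, 45]]        # alleles segregate by continent
nongeographic_region = [[25, 25], [20, 20]]   # same frequencies on both

print(round(cramers_v(geographic_region), 2))
print(round(cramers_v(nongeographic_region), 2))
```

Under these made-up counts, the first region shows a strong allele-by-continent association and the second shows none, which mirrors the distinction the abstract draws between the 32 geographically structured regions and the 42 nongeographical ones.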

Keywords: genomics; genotyping; hidden Markov models; human cytomegalovirus; hypervariability.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cytomegalovirus Infections*
  • Cytomegalovirus* / genetics
  • Gene Frequency
  • Genomics
  • Humans