Evolution of Endogenous Retroviruses in the Subfamily of Caprinae

Viruses. 2024 Mar 4;16(3):398. doi: 10.3390/v16030398.

Abstract

Interest in endogenous retroviruses (ERVs) has been fueled by their impact on the evolution of the host genome. In this study, we used multiple pipelines to conduct a de novo exploration and annotation of ERVs in 13 species of the Caprinae subfamily. Through analyses of sequence identity, structural organization, and phylogeny, we defined 28 ERV groups within Caprinae, comprising 19 gamma retrovirus groups and 9 beta retrovirus groups. Notably, we identified four recent and potentially active groups prevalent in the Caprinae genomes. Additionally, our investigation revealed that most long noncoding RNA (lncRNA) genes and protein-coding (PC) genes contain ERV-derived sequences. In sheep, ERV-derived sequences were present in approximately 75% of protein-coding genes and 81% of lncRNA genes; in goats, they were present in approximately 74% of protein-coding genes and 75% of lncRNA genes. We conclude that the majority of ERVs in the Caprinae genomes are fossils, remnants of past retroviral infections that have become permanently integrated into the genomes. Nevertheless, the identification of the Cap_ERV_20, Cap_ERV_21, Cap_ERV_24, and Cap_ERV_25 groups indicates that relatively recent and potentially active ERVs are present in these genomes, and these groups may contribute to the ongoing evolution of the Caprinae genome. The identification of putatively active ERVs in the Caprinae genomes raises the possibility of harnessing them for future genetic marker development.
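
The abstract reports the share of genes that contain ERV-derived sequences but does not describe how that overlap was computed. As an illustration only, and not the authors' pipeline, the following minimal sketch shows one way such a percentage could be derived from two sets of genomic intervals. The function name `fraction_overlapping`, the half-open `(chrom, start, end)` tuple layout, and the toy coordinates are all assumptions made for this example.

```python
from bisect import bisect_left
from collections import defaultdict


def fraction_overlapping(genes, ervs):
    """Return the fraction of genes overlapping at least one ERV interval.

    Both inputs are lists of (chrom, start, end) tuples using half-open
    coordinates [start, end). This layout is an assumption of the sketch.
    """
    # Group ERV intervals by chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end in ervs:
        by_chrom[chrom].append((start, end))

    # For each chromosome, keep the sorted start positions and the running
    # maximum of end positions, so each gene is answered with one binary search.
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts = [s for s, _ in intervals]
        prefix_max_end = []
        running = float("-inf")
        for _, e in intervals:
            running = max(running, e)
            prefix_max_end.append(running)
        index[chrom] = (starts, prefix_max_end)

    hits = 0
    for chrom, g_start, g_end in genes:
        if chrom not in index:
            continue
        starts, prefix_max_end = index[chrom]
        # ERVs at positions [0, k) start before the gene ends.
        k = bisect_left(starts, g_end)
        # The gene overlaps an ERV if any of those ERVs ends after the gene starts.
        if k > 0 and prefix_max_end[k - 1] > g_start:
            hits += 1

    return hits / len(genes) if genes else 0.0


# Toy data (hypothetical coordinates, not taken from the study).
toy_genes = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 0, 50), ("chr1", 500, 600)]
toy_ervs = [("chr1", 150, 160), ("chr1", 400, 450), ("chr2", 40, 60)]
print(fraction_overlapping(toy_genes, toy_ervs))
```

In this toy input, two of the four genes overlap an ERV interval; the ERV starting at position 400 only abuts the gene ending at 400 and is not counted under half-open coordinates. A real analysis would also need to decide whether introns, flanking regions, or a minimum overlap length count toward a gene "containing" an ERV-derived sequence.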

Keywords: Caprinae; ERVs; divergence pattern; gene overlapping; genome annotation.

MeSH terms

  • Animals
  • Endogenous Retroviruses* / genetics
  • Evolution, Molecular
  • Phylogeny
  • RNA, Long Noncoding* / genetics
  • Retroviridae Infections*
  • Sheep

Substances

  • RNA, Long Noncoding