Evolutionary interrogation of human biology in well-annotated genomic framework of rhesus macaque

Mol Biol Evol. 2014 May;31(5):1309-24. doi: 10.1093/molbev/msu084. Epub 2014 Feb 27.

Abstract

With a genome sequence and composition highly analogous to those of human, rhesus macaque represents a unique reference for evolutionary studies of human biology. Here, we developed a comprehensive genomic framework of rhesus macaque, RhesusBase2, for evolutionary interrogation of human genes and their associated regulations. A total of 1,667 next-generation sequencing (NGS) data sets were processed, integrated, and evaluated, generating 51.2 million new functional annotation records. With extensive NGS annotations, RhesusBase2 refined the fine-scale structures of 30% of the macaque Ensembl transcripts, reporting an accurate, up-to-date set of macaque gene models. On the basis of these annotations and accurate macaque gene models, we further developed an NGS-oriented Molecular Evolution Gateway to access and visualize macaque annotations in reference to human orthologous genes and associated regulations (www.rhesusbase.org/molEvo). We highlighted the application of this well-annotated genomic framework in generating hypothetical links between human-biased regulations and human-specific traits, using mechanistic characterization of the DIEXF gene as an example that provides novel clues to understanding digestive system reduction in human evolution. On a global scale, we also identified a catalog of 9,295 human-biased regulatory events, which may represent novel elements that have a substantial impact on shaping the human transcriptome and possibly underpin recent human phenotypic evolution. Taken together, we provide an NGS data-driven, information-rich framework that will broadly benefit genomics research and serve as an important resource for in-depth evolutionary studies of human biology.

Keywords: RhesusBase; human evolution; human regulation; human-specific trait; next-generation sequencing; rhesus macaque.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Databases, Nucleic Acid
  • Evolution, Molecular*
  • Gene Expression Profiling
  • Genome, Human
  • Genomics
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Macaca mulatta / genetics*
  • Models, Genetic
  • Molecular Sequence Annotation
  • Species Specificity