Chromosome-level de novo assembly of the pig-tailed macaque genome using linked-read sequencing and HiC proximity scaffolding

Gigascience. 2020 Jul 1;9(7):giaa069. doi: 10.1093/gigascience/giaa069.

Abstract

Background: Macaque species share >93% genome homology with humans and develop many disease phenotypes similar to those of humans, making them valuable animal models for the study of human diseases (e.g., HIV and neurodegenerative diseases). However, the quality of genome assembly and annotation for several macaque species lags behind the human genome effort.

Results: To close this gap and enhance functional genomics approaches, we combined de novo linked-read assembly with scaffolding by proximity ligation assay (HiC) to assemble the pig-tailed macaque (Macaca nemestrina) genome. This combinatorial method yielded large scaffolds at chromosome level with a scaffold N50 of 127.5 Mb; the 23 largest scaffolds covered 90% of the entire genome. This assembly revealed large-scale rearrangements between pig-tailed macaque chromosomes 7, 12, and 13 and human chromosomes 2, 14, and 15. We subsequently annotated the genome using transcriptome and proteomics data from personalized induced pluripotent stem cells derived from the same animal. Reconstruction of the evolutionary tree using whole-genome annotation and orthologous comparisons among 3 macaque species, human, and mouse genomes revealed extensive homology between humans and pig-tailed macaques with regard to both pluripotent stem cell genes and innate immune gene pathways. Our results confirm that rhesus and cynomolgus macaques exhibit a closer evolutionary distance to each other than either species exhibits to humans or pig-tailed macaques.
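
For readers unfamiliar with the contiguity metrics quoted in the Results, the following minimal Python sketch shows how a scaffold N50 and a "number of largest scaffolds covering 90% of the genome" figure are conventionally computed from a list of scaffold lengths. It is an illustration under stated assumptions, not the authors' pipeline: the function names `n50` and `scaffolds_to_cover` are hypothetical, and the toy lengths are made up for the example rather than taken from the assembly.

```python
def n50(lengths):
    """Return the N50: the length of the scaffold at which the running total,
    taken from longest to shortest, first reaches half the assembly size."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


def scaffolds_to_cover(lengths, fraction):
    """Return how many of the largest scaffolds are needed to cover
    the given fraction (0 to 1) of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return count
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths in Mb (illustrative only, not data from the paper).
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))                      # N50 of the toy assembly
print(scaffolds_to_cover(toy_lengths, 0.9))  # scaffolds needed for 90%
```

Applied to a real assembly, the input would be the per-scaffold lengths read from the assembly FASTA index; a high N50 together with a small scaffold count at 90% coverage is what indicates chromosome-scale contiguity.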

Conclusions: These findings demonstrate that pig-tailed macaques can serve as an excellent animal model for the study of many human diseases, particularly with regard to pluripotency and innate immune pathways.

Keywords: HiC; chromosome-level assembly; linked-read; nonhuman primate; pig-tailed macaque.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Chromosomes*
  • Computational Biology / methods
  • Genome*
  • Genomics* / methods
  • Humans
  • Karyotyping / methods
  • Macaca nemestrina / genetics*
  • Male
  • Molecular Sequence Annotation
  • Proteomics / methods
  • Repetitive Sequences, Nucleic Acid