Evolutionary hallmarks of the human proteome: chasing the age and coregulation of protein-coding genes

BMC Genomics. 2016 Oct 25;17(Suppl 8):725. doi: 10.1186/s12864-016-3062-y.

Abstract

Background: The development of large-scale technologies for quantitative transcriptomics has enabled comprehensive analysis of gene expression profiles in complete genomes. RNA-Seq allows the measurement of gene expression levels in a manner far more precise and global than previous methods. Studies using this technology are altering our view of the extent and complexity of eukaryotic transcriptomes. In this respect, multiple efforts have been made to determine and analyse the gene expression patterns of human cell types under different conditions, in either normal or pathological states. However, until recently, little has been reported about the evolutionary marks present in human protein-coding genes, particularly from the combined perspective of gene expression and protein evolution.

Results: We present a combined analysis of human protein-coding gene expression profiling and time-scale ancestry mapping that places the genes in taxonomic clades and reveals eight major evolutionary steps ("hallmarks"), which include clusters of functionally coherent proteins. The expressed human genes are analysed using an RNA-Seq dataset of 116 samples from 32 tissues. The evolutionary analysis of the human proteins combines information from: (i) a database of orthologous proteins (OMA), (ii) the taxonomy mapping of genes to lineage clades (from NCBI Taxonomy) and (iii) the evolutionary time-scale mapping provided by TimeTree (Timescale of Life). The human protein-coding genes are also placed in a relational context based on the construction of a robust gene coexpression network, which reveals tighter links between age-related protein-coding genes and identifies functionally coherent gene modules.
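The ancestry-mapping idea described above can be illustrated with a minimal sketch: each human gene is assigned to the oldest clade in which an ortholog is found. This is not the authors' pipeline. The function name `assign_gene_age`, the two lookup tables and the clade ages are all hypothetical placeholders; in particular, the ages are illustrative values and are not taken from TimeTree, and the species-to-clade table stands in for what would be derived from OMA orthologs and NCBI Taxonomy lineages.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical clade ages in million years ago.
# Placeholder values for illustration only, NOT taken from TimeTree.
CLADE_AGE_MYA: Dict[str, float] = {
    "Primates": 75.0,
    "Mammalia": 180.0,
    "Vertebrata": 500.0,
    "Eukaryota": 1500.0,
}

# Hypothetical lookup: for each species, the most recent clade it shares
# with human. In practice this would come from a taxonomy lineage.
SPECIES_TO_SHARED_CLADE: Dict[str, str] = {
    "Pan troglodytes": "Primates",
    "Mus musculus": "Mammalia",
    "Danio rerio": "Vertebrata",
    "Saccharomyces cerevisiae": "Eukaryota",
}


def assign_gene_age(
    ortholog_species: Iterable[str],
) -> Optional[Tuple[str, float]]:
    """Return (clade, age) for the oldest clade containing an ortholog.

    Species missing from the lookup table are ignored. Returns None when
    no ortholog maps to a known clade.
    """
    clades = {
        SPECIES_TO_SHARED_CLADE[species]
        for species in ortholog_species
        if species in SPECIES_TO_SHARED_CLADE
    }
    if not clades:
        return None
    # The gene is at least as old as the deepest clade sharing an ortholog.
    oldest = max(clades, key=lambda clade: CLADE_AGE_MYA[clade])
    return oldest, CLADE_AGE_MYA[oldest]


if __name__ == "__main__":
    # A gene with orthologs in mouse and zebrafish is placed in the
    # deeper of the two clades.
    print(assign_gene_age(["Mus musculus", "Danio rerio"]))
```

Taking the oldest clade reflects the assumption that a gene must predate the deepest lineage in which an ortholog is detected; a real analysis would also need to handle gene loss and incomplete ortholog detection.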

Conclusions: Understanding the relational landscape of the human protein-coding genes is essential for interpreting the functional elements and modules of our active genome. Moreover, decoding the evolutionary history of human genes can provide valuable information about their origin and function.

Keywords: Gene coexpression; Gene house-keeping; Gene tissue-enriched; Human gene evolution; Human protein evolution; Protein families; RNA-seq; Tissue transcriptomics; Transcriptomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Computational Biology / methods
  • Evolution, Molecular*
  • Gene Expression Profiling
  • Gene Expression Regulation
  • Gene Regulatory Networks
  • Genomics / methods
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Molecular Sequence Annotation
  • Open Reading Frames
  • Organ Specificity / genetics
  • Proteome*
  • Proteomics* / methods
  • Transcriptome

Substances

  • Proteome