Glycomics and Glycoproteomics

Review
In: Essentials of Glycobiology [Internet]. 4th edition. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press; 2022. Chapter 51.

Excerpt

The term “genomics” arose from the availability of complete genome sequence data as well as computational methods for their analysis. However, <2% of genes in the human genome encode proteins. These genes are transcribed into messenger RNAs (mRNAs) that make up the “transcriptome,” of which ∼30% are assigned to protein coding. The total complement of proteins expressed by the cell is collectively termed the “proteome.” Most eukaryotic proteins are posttranslationally modified (e.g., by phosphorylation, sulfation, oxidation, ubiquitination, acetylation, methylation, lipidation, or glycosylation). These modifications, combined with alternative RNA splicing in eukaryotes, render the proteome considerably more complex than the transcriptome. Although it has been estimated that approximately 120,000 different protein splice forms are expressed by human cells, the total number of modified proteoforms is likely to be at least an order of magnitude higher. The systems-level analysis of all proteins expressed by cells, tissues, or organisms is known as “proteomics.” The proteome, like the transcriptome but unlike the DNA sequence of the genome, is fundamentally dynamic. The repertoire of proteins expressed by a cell is highly dependent on tissue type, microenvironment, and stage within the life cycle. As cells receive external and internal cues in the form of growth factors, hormones, metabolites, or cell–cell interactions, the expression of various genes is modulated, with levels ranging from silence to more than 10⁴ mRNA copies per cell and 10⁷ protein molecules per cell. Thus, proteomes and their modifications vary during cell differentiation, activation, trafficking, and malignant transformation.
