Functional annotation of the human chromosome 7 "missing" proteins: a bioinformatics approach

J Proteome Res. 2013 Jun 7;12(6):2504-10. doi: 10.1021/pr301082p. Epub 2013 Jan 11.

Abstract

The Chromosome-centric Human Proteome Project aims to systematically map all human proteins, chromosome by chromosome, in a gene-centric manner through dedicated efforts from national and international teams. This mapping will lead to a knowledge-based resource that defines the full set of proteins encoded on each chromosome and lays the foundation for a standardized approach to analyzing the massive proteomic data sets currently being generated. The neXtProt database lists 946 proteins as the human proteome of chromosome 7. However, 170 (18%) of these proteins have no evidence at the proteomic, antibody, or structural levels and are considered "missing" in this study because they lack experimental support. We have developed a protocol for the functional annotation of these "missing" proteins that integrates several bioinformatics analysis and annotation tools, including sequential BLAST homology searches, protein domain/motif and gene ontology (GO) mapping, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Using the BLAST search strategy, homologues among reviewed non-human mammalian proteins with protein evidence were identified for 90 "missing" proteins, while another 38 had reviewed non-human mammalian homologues. Putative functional annotations were assigned to 27 of the remaining 43 novel proteins. Proteotypic peptides were computationally generated to facilitate rapid identification of these proteins. Four of the "missing" chromosome 7 proteins have been substantiated by the ENCODE proteogenomic peptide data.
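The tiered outcome of the sequential BLAST searches described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Hit` record, its field names, the tier labels, and the `1e-5` E-value cutoff are all assumptions introduced here for illustration, and the hits are assumed to have been parsed from search output beforehand.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Hit:
    """One homology hit for a query protein (hypothetical record layout)."""
    subject_id: str
    reviewed: bool           # subject is a curated ("reviewed") database entry
    non_human_mammal: bool   # subject comes from a non-human mammalian species
    protein_evidence: bool   # subject has protein-level experimental evidence
    evalue: float            # BLAST E-value of the alignment


def classify(hits: Iterable[Hit], max_evalue: float = 1e-5) -> str:
    """Assign a query protein to the first tier that its hits satisfy.

    The cutoff value is an assumption; the abstract does not state one.
    """
    usable = [
        h for h in hits
        if h.reviewed and h.non_human_mammal and h.evalue <= max_evalue
    ]
    # Tier 1: a reviewed non-human mammalian homologue with protein evidence.
    if any(h.protein_evidence for h in usable):
        return "homologue_with_protein_evidence"
    # Tier 2: a reviewed non-human mammalian homologue of any evidence level.
    if usable:
        return "reviewed_homologue"
    # Tier 3: nothing usable, so the protein is treated as novel.
    return "novel"


def tally(hits_by_protein: Dict[str, List[Hit]]) -> Counter:
    """Count how many query proteins fall into each tier."""
    return Counter(classify(hits) for hits in hits_by_protein.values())
```

Called on a dictionary that maps each "missing" protein accession to its parsed hits, `tally` returns per-tier counts that can be compared with the 90, 38, and remaining novel proteins reported in the abstract. Checking tier 1 before tier 2 mirrors the sequential nature of the searches: a protein is only passed on to a weaker evidence tier when the stronger one yields no usable hit.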

MeSH terms

  • Algorithms
  • Animals
  • Chromosome Mapping
  • Chromosomes, Human, Pair 7*
  • Databases, Protein
  • Genome, Human*
  • Human Genome Project*
  • Humans
  • Mammals / genetics*
  • Mammals / metabolism
  • Molecular Sequence Annotation*
  • Proteome / genetics*
  • Proteome / metabolism
  • Sequence Homology, Amino Acid

Substances

  • Proteome