Mapping the protein domain structures of the respiratory mucins: a mucin proteome coverage study

J Proteome Res. 2012 Aug 3;11(8):4013-23. doi: 10.1021/pr300058z. Epub 2012 Jun 28.

Abstract

Mucin genes encode a family of the largest expressed proteins in the human genome. The proteins are highly substituted with O-linked oligosaccharides that greatly restrict access to the peptide backbones. The genomic organization of the N-terminal, O-glycosylated, and C-terminal regions of most of the mucins has been established and is available in the sequence databases. However, much less is known about the fate of their exposed protein regions after translation and secretion, and to date, detailed proteomic studies complementary to the genomic studies are rather limited. Using mucins isolated from cultured human airway epithelial cell secretions, trypsin digestion, and mass spectrometry, we investigated the proteome coverage of the mucins responsible for the maintenance and protection of the airway epithelia. Excluding the heavily glycosylated mucin domains, up to 85% coverage of the N-terminal region of the gel-forming mucins MUC5B and MUC5AC was achieved, and up to 60% of the C-terminal regions were covered, suggesting that more N- and sparsely O-glycosylated regions as well as possible other modifications are available at the C-terminus. All possible peptides from the cysteine-rich regions that interrupt the heavily glycosylated mucin domains were identified. Interestingly, 43 non-tryptic cleavage sites, each at the N-terminal end of an identified peptide, were found in 10 different domains of MUC5B and MUC5AC, indicating potential exposure to proteolytic and/or "spontaneous cleavages". Some of these non-tryptic cleavages may be important for proper maturation of the molecule, before and/or after secretion. Most of the peptides identified from MUC16 were from the SEA region. Surprisingly, three peptides were clearly identified from its heavily glycosylated regions. Up to 25% coverage of MUC4 was achieved, covering seven different domains of the molecule. All peptides from the MUC1 cytoplasmic domain were detected along with the three non-tryptic cleavages in the region. Only one peptide was identified from MUC20, which enabled us to raise successful antisera against the molecule. Taken together, this report represents our current efforts to dissect the complexities of mucin macromolecules. Identification of regions accessible to proteolysis can help in the design of effective antibodies and points to regions that might be available for mucin-protein interactions, and identification of cleavage sites will enable understanding of their pre- and post-secretory processing in normal and disease environments.
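
The two quantities reported in the abstract, sequence coverage and non-tryptic N-terminal cleavage, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes a simplified trypsin rule (cleave after K or R unless the next residue is P), and the sequence, peptides, and function names are hypothetical toy examples, not mucin data.

```python
import re

def tryptic_digest(protein):
    """In-silico digest under a simplified rule (an assumption):
    cleave after K or R, except when the next residue is P.
    Splitting on a zero-width match needs Python 3.7 or later."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

def coverage(protein, peptides):
    """Fraction of residues covered by at least one identified peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)

def has_nontryptic_n_terminus(protein, peptide):
    """True if the peptide's N-terminal end is not explained by the
    simplified trypsin rule. Peptides that are absent from the protein
    or that start at the protein N-terminus are not flagged."""
    start = protein.find(peptide)
    if start <= 0:
        return False
    return protein[start - 1] not in "KR" or peptide[0] == "P"

# Hypothetical toy sequence and peptide identifications.
toy_protein = "MTGKLLERPAAKSSDW"
identified = ["MTGK", "SSDW", "ERPAAK"]

print(tryptic_digest(toy_protein))
print(f"coverage: {coverage(toy_protein, identified):.0%}")
for pep in identified:
    print(pep, has_nontryptic_n_terminus(toy_protein, pep))
```

In this toy case the peptide `ERPAAK` is preceded by L rather than K or R, so it is flagged as having a non-tryptic N-terminus. The sketch checks only the N-terminal end of each peptide, as in the abstract, and a real analysis would also need to handle missed cleavages, modifications, and repeated sequence.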

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Amino Acid Sequence
  • Cell Line
  • Epithelial Cells / metabolism
  • Humans
  • Molecular Sequence Data
  • Mucins / chemistry*
  • Peptide Fragments / chemistry
  • Peptide Mapping
  • Protein Structure, Tertiary
  • Proteolysis
  • Proteome / metabolism*
  • Respiratory Mucosa / cytology
  • Respiratory Mucosa / metabolism
  • Trypsin / chemistry

Substances

  • Mucins
  • Peptide Fragments
  • Proteome
  • Trypsin