Clustering and visualizing similarity networks of membrane proteins

Proteins. 2015 Aug;83(8):1450-61. doi: 10.1002/prot.24832. Epub 2015 Jun 6.

Abstract

We proposed a fast, unsupervised clustering method, minimum span clustering (MSC), for analyzing the sequence-structure-function relationship of biological networks, and demonstrated its validity by clustering the sequence/structure similarity networks (SSN) of 682 membrane protein (MP) chains. The MSC clustering of MPs based on their sequence information was found to be consistent with their tertiary structures and functions. For the seven largest clusters predicted by MSC, the consistency in chain function within the same cluster was found to be 100%. By analyzing the edge distribution of the SSN for MPs, we found a characteristic threshold distance marking the boundary between clusters, at which the SSN of MPs could be properly clustered by an unsupervised sparsification of the network distance matrix. The clustering results of MPs from MSC and from the unsupervised sparsification method were consistent with each other, and showed high intracluster similarity and low intercluster similarity in sequence, structure, and function. Our study showed a strong sequence-structure-function relationship of MPs. We discussed evidence of convergent evolution of MPs and suggested applications in finding structural similarities and predicting biological functions of MP chains based on their sequence information.
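The sparsification step mentioned in the abstract can be illustrated with a minimal sketch. This is not the MSC algorithm, whose details the abstract does not give, and it is not the authors' implementation. It assumes that sparsification means dropping every edge whose distance exceeds a threshold and reading clusters off as the connected components of what remains. The function name `threshold_clusters`, the toy matrix `toy_dist`, and the threshold value 0.5 are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def threshold_clusters(dist: List[List[float]], threshold: float) -> List[List[int]]:
    """Cluster nodes by sparsifying a symmetric distance matrix.

    Assumption (not from the paper): an edge is kept when its distance is
    <= threshold, and clusters are the connected components of the kept edges.
    """
    n = len(dist)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:  # edge survives sparsification
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Largest clusters first; ties broken by smallest member index.
    return sorted(groups.values(), key=lambda g: (-len(g), g[0]))


# Hypothetical toy distances: nodes 0-2 are mutually close, as are nodes 3-4.
toy_dist = [
    [0.00, 0.10, 0.20, 0.90, 0.80],
    [0.10, 0.00, 0.15, 0.85, 0.90],
    [0.20, 0.15, 0.00, 0.95, 0.90],
    [0.90, 0.85, 0.95, 0.00, 0.10],
    [0.80, 0.90, 0.90, 0.10, 0.00],
]

clusters = threshold_clusters(toy_dist, threshold=0.5)
```

In this toy case any threshold between the largest within-group distance (0.20) and the smallest between-group distance (0.80) gives the same two clusters, which is the kind of gap in the edge distribution that a characteristic threshold distance would exploit.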

Keywords: membrane proteins; network clustering; protein function; protein similarity networks; protein structure; sequence homology.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis*
  • Computational Biology / methods*
  • Databases, Protein
  • Markov Chains
  • Membrane Proteins / chemistry
  • Membrane Proteins / classification*
  • Membrane Proteins / physiology
  • Sequence Homology, Amino Acid

Substances

  • Membrane Proteins