The Difference in Structural States between Canonical Proteins and Their Isoforms Established by Proteome-Wide Bioinformatics Analysis

Biomolecules. 2022 Nov 1;12(11):1610. doi: 10.3390/biom12111610.

Abstract

Alternative splicing is an important means of generating the protein diversity necessary for cellular functions. Hence, there is growing interest in assessing the structural and functional impact of alternative protein isoforms. Experimental studies typically determine the structures of the canonical proteins and ignore the other isoforms. As a result, there is still a large gap between the abundant sequence information and the meager structural data on these isoforms. During the last decade, significant progress has been achieved in the development of bioinformatics tools for structural and functional annotation of proteins. Moreover, the appearance of the AlphaFold program has made it possible to model a large number of high-confidence structures of the isoforms. In this study, using state-of-the-art tools, we performed an in silico analysis of 58 eukaryotic proteomes. The evaluated structural states included structured domains, intrinsically disordered regions, aggregation-prone regions, and tandem repeats. Among other things, we found that the isoforms have fewer signal peptides, transmembrane regions, or tandem repeat regions than their canonical counterparts. This could change protein function and/or cellular localization. The AlphaFold modeling demonstrated that isoforms whose sequences differ from the canonical ones can frequently still fold into similar structures, though with significant structural rearrangements that can lead to changes in their functions. Based on the modeling, we suggest a classification of the structural differences between canonical proteins and isoforms. Altogether, we conclude that a majority of isoforms, like the canonical proteins, are under selective pressure for their functional roles.

Keywords: AlphaFold; canonical protein; isoform; large-scale analysis; protein structure.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing
  • Computational Biology*
  • Protein Isoforms / chemistry
  • Protein Isoforms / genetics
  • Proteome* / genetics

Substances

  • Proteome
  • Protein Isoforms