Low Complexity Induces Structure in Protein Regions Predicted as Intrinsically Disordered

Biomolecules. 2022 Aug 10;12(8):1098. doi: 10.3390/biom12081098.

Abstract

There is increasing evidence that many intrinsically disordered regions (IDRs) in proteins play key functional roles through interactions with other proteins or nucleic acids. These interactions often exhibit context-dependent structural behavior. We hypothesize that low complexity regions (LCRs), often found within IDRs, could play a role in inducing local structure in IDRs. To test this, we predicted IDRs in the human proteome and analyzed their structures or those of homologous sequences in the Protein Data Bank (PDB). We then identified two types of simple LCRs within IDRs: regions with only one type of amino acid (polyX or homorepeats) or with only two types of amino acids (polyXY). We were able to assign structural information from the PDB more often to these LCRs than to the surrounding IDRs (polyX 61.8% > polyXY 50.5% > IDRs 39.7%). The most frequently observed polyX and polyXY within IDRs contained E (Glu) or G (Gly). Structural analyses of these sequences and of homologs indicate that polyEK regions induce helical conformations, while the other most frequent LCRs induce coil structures. Our work proposes bioinformatics methods to help in the study of the structural behavior of IDRs and provides a solid basis suggesting a structuring role of LCRs within them.
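
The two LCR types defined above (stretches built from one residue type, and stretches built from exactly two) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `find_polyx` and `find_polyxy` and the minimum-length thresholds are hypothetical choices made for illustration, since the abstract does not state the criteria actually used.

```python
import re


def find_polyx(seq, min_len=4):
    """Return (start, end, residue) for runs of a single residue type.

    Coordinates are 0-based with an exclusive end.
    min_len is an illustrative threshold, not a value from the paper.
    """
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq)]


def find_polyxy(seq, min_len=6):
    """Return (start, end, residues) for maximal stretches that contain
    exactly two residue types.

    min_len is an illustrative threshold, not a value from the paper.
    """
    hits = []
    last_end = 0
    for start in range(len(seq)):
        seen = set()
        end = start
        # Extend to the right while at most two residue types are present.
        while end < len(seq) and len(seen | {seq[end]}) <= 2:
            seen.add(seq[end])
            end += 1
        # Keep the stretch only if it is not contained in an earlier hit.
        if len(seen) == 2 and end - start >= min_len and end > last_end:
            hits.append((start, end, "".join(sorted(seen))))
            last_end = end
    return hits


if __name__ == "__main__":
    print(find_polyx("MKTEEEEEEAG"))
    print(find_polyxy("QQEKEKEKEKQQ"))
```

Under this simple definition, a homorepeat flanked by one other residue (for example `TEEEEEE`) also counts as a two-residue stretch, so a real analysis would need an additional composition or periodicity rule to separate the two classes.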

Keywords: homorepeats; intrinsically disordered regions; low complexity regions; protein structure.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acids
  • Computational Biology
  • Databases, Protein
  • Humans
  • Intrinsically Disordered Proteins* / chemistry
  • Protein Conformation
  • Protein Domains
  • Proteins* / chemistry

Substances

  • Amino Acids
  • Intrinsically Disordered Proteins
  • Proteins

Grants and funding

This work was supported by the following institutions: Mainz Institute of Multiscale Modeling (M3ODEL) to F.S. and M.A.A.N. for the position of M.G.K.; European Research Council under the European Union’s H2020 Framework Programme (2014–2020)/ERC Grant agreement n° 648030 and Labex EpiGenMed, an “Investissements d’avenir” program (ANR-10-LABX-12-01) awarded to P.B.; French National Research Agency through grant ANR-19-P3IA-0004 to J.C. The Centre de Biologie Structurale (CBS) is a member of France-BioImaging (FBI) and the French Infrastructure for Integrated Structural Biology (FRISBI), two national infrastructures supported by the French National Research Agency (ANR-10-INBS-04-01 and ANR-10-INBS-05, respectively).