Digging into the 3D Structure Predictions of AlphaFold2 with Low Confidence: Disorder and Beyond

Biomolecules. 2022 Oct 13;12(10):1467. doi: 10.3390/biom12101467.

Abstract

AlphaFold2 (AF2) has created a breakthrough in biology by providing three-dimensional structure models for whole-proteome sequences, with unprecedented levels of accuracy. In addition, the AF2 pLDDT score, related to the model confidence, has been shown to provide a good measure of residue-wise disorder. Here, we combined AF2 predictions with pyHCA, a tool we previously developed to identify foldable segments and estimate their order/disorder ratio, from a single protein sequence. We focused our analysis on the AF2 predictions available for 21 reference proteomes (AFDB v1), in particular on their long foldable segments (>30 amino acids) that exhibit characteristics of soluble domains, as estimated by pyHCA. Among these segments, we provided a global analysis of those with very low pLDDT values along their entire length and compared their characteristics to those of segments with very high pLDDT values. We highlighted cases containing conditional order, as well as cases that could form well-folded structures but escape the AF2 prediction due to a shallow multiple sequence alignment and/or undocumented structure or fold. AF2 and pyHCA can therefore be advantageously combined to unravel cryptic structural features in whole proteomes and to refine predictions for different flavors of disorder.
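
As an illustration of the segment selection described in the abstract, the following minimal Python sketch labels a foldable segment by its per-residue pLDDT values. It is not the authors' pipeline. The thresholds (below 50 for "very low", above 90 for "very high") are assumptions borrowed from the usual AlphaFold confidence bands, and the function name, the segment coordinates, and the pLDDT list are hypothetical. In practice the segment boundaries would come from pyHCA and the pLDDT values from the B-factor column of the AF2 model files.

```python
from typing import List

# Assumed thresholds (conventional AlphaFold confidence bands, not
# confirmed as the cut-offs used in the paper).
VERY_LOW = 50.0
VERY_HIGH = 90.0
# "Long" segments are those with more than 30 amino acids, per the abstract.
MIN_LENGTH = 30


def classify_segment(plddt: List[float], start: int, end: int) -> str:
    """Label one segment from its per-residue pLDDT values.

    `start` is 0-based and inclusive, `end` is exclusive (hypothetical
    convention chosen for this sketch).
    """
    if end - start <= MIN_LENGTH:
        return "too_short"
    window = plddt[start:end]
    # The label applies only if every residue of the segment qualifies,
    # mirroring "along their entire length" in the abstract.
    if all(value < VERY_LOW for value in window):
        return "very_low"
    if all(value > VERY_HIGH for value in window):
        return "very_high"
    return "mixed"


# Toy per-residue pLDDT profile: 35 low, 40 high, 10 intermediate residues.
example_plddt = [40.0] * 35 + [95.0] * 40 + [60.0] * 10

labels = {
    (0, 35): classify_segment(example_plddt, 0, 35),
    (35, 75): classify_segment(example_plddt, 35, 75),
    (30, 70): classify_segment(example_plddt, 30, 70),
    (75, 85): classify_segment(example_plddt, 75, 85),
}
```

Requiring every residue to pass the threshold is the strictest reading of "along their entire length"; a mean or a tolerated fraction of outliers would be an alternative design choice.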

Keywords: conditional order; dark proteomes; hidden order; intrinsically disordered domains; long foldable segments; protein sequence; pyHCA; soluble domains.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Amino Acids / chemistry
  • Furylfuramide*
  • Protein Conformation
  • Proteome* / chemistry
  • Sequence Alignment

Substances

  • Proteome
  • Furylfuramide
  • Amino Acids

Grants and funding

A.B. was supported by the PhD program of Doctoral School “Complexité du Vivant” (ED515, Sorbonne Université). This work was supported by the French National Research Agency (PHOSTORE: ANR-19-CE01-0005 and APOTHESIS: ANR-21-CE12-0021).