Recent Progress of Protein Tertiary Structure Prediction

Molecules. 2024 Feb 13;29(4):832. doi: 10.3390/molecules29040832.

Abstract

Predicting three-dimensional (3D) protein structure from amino acid sequence has been a major challenge in computational and structural bioinformatics for decades. Recently, the widespread adoption of artificial intelligence (AI) algorithms has substantially accelerated progress in protein structure prediction, yielding numerous milestones. In particular, the end-to-end deep learning method AlphaFold2 has raised structure prediction performance to new heights, producing models that were regularly competitive with experimental structures in the 14th Critical Assessment of Protein Structure Prediction (CASP14). To give researchers a comprehensive understanding of the field and to guide future research, this review describes methodologies, assessments, and databases in protein structure prediction, including traditional methods, such as template-based modeling (TBM) and template-free modeling (FM) approaches; recently developed deep learning-based methods, such as contact/distance-guided methods, end-to-end folding methods, and protein language model (PLM)-based methods; multi-domain protein structure prediction methods; the CASP experiments and related assessments; and the recently released AlphaFold Protein Structure Database (AlphaFold DB). We discuss their advantages, disadvantages, and application scopes, aiming to help researchers understand the limitations and appropriate contexts of these methods and select among them effectively in protein-related fields.

Keywords: AlphaFold2; contact map; deep learning; distance map; end-to-end methods; multi-domain proteins; protein language model; protein tertiary structure prediction; template-based modeling; template-free modeling.

Publication types

  • Review

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Computational Biology / methods
  • Databases, Protein
  • Models, Molecular
  • Protein Conformation
  • Protein Folding
  • Proteins* / chemistry
  • Software

Substances

  • Proteins