Assessing PDB macromolecular crystal structure confidence at the individual amino acid residue level

Structure. 2022 Oct 6;30(10):1385-1394.e3. doi: 10.1016/j.str.2022.08.004. Epub 2022 Aug 31.

Abstract

Approximately 87% of the more than 190,000 atomic-level three-dimensional (3D) biostructures in the PDB were determined using macromolecular crystallography (MX). Agreement between 3D atomic coordinates and experimental data for >100 million individual amino acid residues occurring within ∼150,000 PDB MX structures was analyzed in detail. The real-space correlation coefficient (RSCC) calculated using the 3D atomic coordinates for each residue and experimental-data-derived electron density enables outlier detection of unreliable atomic coordinates (particularly important for poorly resolved side-chain atoms) and ready evaluation of local structure quality by PDB users. For human protein MX structures in PDB, comparisons of the per-residue RSCC metric with AlphaFold2-computed structure model confidence (pLDDT; predicted local distance difference test) document (1) that RSCC values and pLDDT scores are correlated (median correlation coefficient ∼0.41), and (2) that experimentally determined MX structures (3.5 Å resolution or better) are more reliable than AlphaFold2-computed structure models and should be used preferentially whenever possible.
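As an illustration of the per-residue metric described above, RSCC is conventionally a Pearson correlation between the experimental-data-derived electron density and the density calculated from the atomic model, sampled at the same grid points around one residue. The following is a minimal sketch under that assumption and is not the authors' pipeline: the function names `rscc` and `flag_low_rscc`, the input layout, and the 0.8 cutoff are all hypothetical choices made for illustration, and the abstract does not state how grid points are selected or which cutoff defines an outlier.

```python
import math


def rscc(rho_obs, rho_calc):
    """Pearson correlation between observed and model-calculated density.

    Both arguments are equal-length sequences of density values sampled at
    the same grid points around a single residue (assumed input layout).
    """
    if len(rho_obs) != len(rho_calc) or len(rho_obs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(rho_obs)
    mean_obs = sum(rho_obs) / n
    mean_calc = sum(rho_calc) / n
    # Covariance and variance sums (unnormalized; the 1/n factors cancel).
    cov = sum((o - mean_obs) * (c - mean_calc) for o, c in zip(rho_obs, rho_calc))
    var_obs = sum((o - mean_obs) ** 2 for o in rho_obs)
    var_calc = sum((c - mean_calc) ** 2 for c in rho_calc)
    denom = math.sqrt(var_obs * var_calc)
    if denom == 0.0:
        return float("nan")  # flat density: correlation is undefined
    return cov / denom


def flag_low_rscc(per_residue_rscc, cutoff=0.8):
    """Return residue labels whose RSCC falls below an illustrative cutoff.

    The default cutoff of 0.8 is an arbitrary example value, not a threshold
    taken from the paper.
    """
    return [label for label, value in per_residue_rscc.items() if value < cutoff]
```

A value near 1 means the modeled atoms account well for the local density, while low values point to residues (often side chains) whose coordinates are weakly supported by the data. A call such as `flag_low_rscc({"A:10": 0.95, "A:11": 0.60})` would list only `"A:11"` under the example cutoff.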

Keywords: AlphaFold2; AlphaFoldDB; PDB; Protein Data Bank; RCSB PDB; RSCC; macromolecular structure quality; pLDDT; real-space correlation coefficient; structure confidence.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acids*
  • Databases, Protein
  • Humans
  • Macromolecular Substances
  • Myxovirus Resistance Proteins
  • Protein Conformation

Substances

  • Amino Acids
  • Macromolecular Substances
  • Myxovirus Resistance Proteins