FLIPPER: Predicting and Characterizing Linear Interacting Peptides in the Protein Data Bank

J Mol Biol. 2021 Apr 30;433(9):166900. doi: 10.1016/j.jmb.2021.166900. Epub 2021 Feb 27.

Abstract

A large fraction of peptides or protein regions are disordered in isolation and fold upon binding. These regions, also called MoRFs, SLiMs or LIPs, are often associated with signaling and regulation processes. However, despite their importance, only a limited number of examples are available in public databases and their automatic detection at the proteome level is problematic. Here we present FLIPPER, an automatic method for the detection of structurally linear sub-regions or peptides that interact with another chain in a protein complex. FLIPPER is a random forest classifier that takes the protein structure as input and provides the propensity of each amino acid to be part of a LIP region. Models are built on structural features such as intra- and inter-chain contacts, secondary structure, solvent accessibility in both the bound and unbound states, structural linearity and chain length. FLIPPER is accurate when evaluated on non-redundant independent datasets, reaching 99% precision and 99% sensitivity on PixelDB-25 and 87% precision and 88% sensitivity on DIBS-25. Finally, we used FLIPPER to process the entire Protein Data Bank and identified different classes of LIPs based on different binding modes and partner molecules. We provide a detailed description of these LIP categories and show that a large fraction of these regions are not detected by disorder predictors. All FLIPPER predictions are integrated in the MobiDB 4.0 database.
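
As a rough illustration of how per-residue propensities relate to the precision and sensitivity figures quoted above, the following minimal Python sketch turns a propensity profile into contiguous regions and scores it against reference labels. It is not FLIPPER's code: the function names, the 0.5 cutoff, and the eight-residue example are all assumptions made for illustration, and the abstract does not state how FLIPPER thresholds its output or whether evaluation is done per residue.

```python
from typing import List, Tuple


def call_regions(propensity: List[float], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Group consecutive residues scoring >= threshold into (start, end) pairs.

    Indices are 0-based and inclusive. The 0.5 default is an assumed cutoff.
    """
    regions = []
    start = None
    for i, score in enumerate(propensity):
        if score >= threshold:
            if start is None:
                start = i  # open a new region
        elif start is not None:
            regions.append((start, i - 1))  # close the current region
            start = None
    if start is not None:
        regions.append((start, len(propensity) - 1))
    return regions


def precision_sensitivity(predicted: List[bool], actual: List[bool]) -> Tuple[float, float]:
    """Residue-level precision and sensitivity (recall) for binary labels."""
    if len(predicted) != len(actual):
        raise ValueError("label lists must have the same length")
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return precision, sensitivity


# Made-up propensities and reference labels for a hypothetical 8-residue chain.
scores = [0.1, 0.7, 0.9, 0.8, 0.2, 0.6, 0.3, 0.1]
reference = [False, True, True, True, True, False, False, False]
predicted = [s >= 0.5 for s in scores]

print(call_regions(scores))                         # [(1, 3), (5, 5)]
print(precision_sensitivity(predicted, reference))  # (0.75, 0.75)
```

In this toy case three of the four predicted residues are true LIP residues (precision 0.75) and three of the four reference residues are recovered (sensitivity 0.75).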

Keywords: binding modes prediction; intrinsic disorder; linear interacting peptides; machine learning; protein structure.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Databases, Protein*
  • Datasets as Topic
  • Humans
  • Models, Molecular
  • Nucleic Acids / chemistry
  • Peptides / chemistry*
  • Peptides / metabolism*
  • Protein Binding
  • Protein Folding*
  • Protein Structure, Secondary

Substances

  • Nucleic Acids
  • Peptides