Iterative Machine Learning for Classification and Discovery of Single-Molecule Unfolding Trajectories from Force Spectroscopy Data

Nano Lett. 2023 Nov 22;23(22):10406-10413. doi: 10.1021/acs.nanolett.3c03026. Epub 2023 Nov 7.

Abstract

We report the application of machine learning techniques to expedite classification and analysis of protein unfolding trajectories from force spectroscopy data. Using kernel methods, logistic regression, and triplet loss, we developed a workflow called Forced Unfolding and Supervised Iterative Online (FUSION) learning where a user classifies a small number of repeatable unfolding patterns encoded as images, and a machine is tasked with identifying similar images to classify the remaining data. We tested the workflow using two case studies on a multidomain XMod-Dockerin/Cohesin complex, validating the approach first using synthetic data generated with a Monte Carlo algorithm and then deploying the method on experimental atomic force spectroscopy data. FUSION efficiently separated traces that passed quality filters from unusable ones, classified curves with high accuracy, and identified unfolding pathways that were undetected by the user. This study demonstrates the potential of machine learning to accelerate data analysis and generate new insights in protein biophysics.
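The label-train-rank loop described above can be illustrated with a minimal sketch. This is not the authors' FUSION implementation: it assumes toy 2-D feature vectors in place of image encodings, a plain gradient-descent logistic regression in place of the kernel-based models, and a simulated user who always supplies the true label. All function names and parameters (`make_toy_data`, `iterative_screen`, `n_seed`, `batch`, `rounds`) are hypothetical. The triplet margin loss is shown only in its standard form and is not used in the toy loop.

```python
import math
import random


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, math.dist(anchor, positive) - math.dist(anchor, negative) + margin)


def predict_proba(w, b, x):
    """Logistic probability that feature vector x belongs to the target class."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    z = max(-30.0, min(30.0, z))  # clamp so exp() stays well behaved
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Batch gradient descent on the logistic loss (weights plus bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            gw = [g + err * xj for g, xj in zip(gw, xi)]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def make_toy_data(n_per_class=100, seed=0):
    """Assumption: two Gaussian clusters stand in for two unfolding patterns."""
    rng = random.Random(seed)
    data = []
    for label, center in ((1, (2.0, 2.0)), (0, (-2.0, -2.0))):
        for _ in range(n_per_class):
            x = (rng.gauss(center[0], 0.7), rng.gauss(center[1], 0.7))
            data.append((x, label))
    rng.shuffle(data)
    return data


def iterative_screen(data, n_seed=5, batch=20, rounds=4):
    """Seed with a few user labels, then repeatedly train, rank, and label."""
    labeled = {}
    for cls in (0, 1):
        seeds = [i for i, (_, lab) in enumerate(data) if lab == cls][:n_seed]
        labeled.update({i: cls for i in seeds})
    for _ in range(rounds):
        idx = list(labeled)
        w, b = fit_logistic([data[i][0] for i in idx], [labeled[i] for i in idx])
        pool = [i for i in range(len(data)) if i not in labeled]
        # Rank unlabeled items by similarity to the target class.
        pool.sort(key=lambda i: predict_proba(w, b, data[i][0]), reverse=True)
        for i in pool[:batch]:
            labeled[i] = data[i][1]  # simulated user confirms the true label
    idx = list(labeled)
    w, b = fit_logistic([data[i][0] for i in idx], [labeled[i] for i in idx])
    return w, b, labeled


def accuracy_on_unlabeled(data, w, b, labeled):
    """Fraction of never-labeled items that the final model classifies correctly."""
    rest = [i for i in range(len(data)) if i not in labeled]
    hits = sum((predict_proba(w, b, data[i][0]) >= 0.5) == (data[i][1] == 1) for i in rest)
    return hits / len(rest)


if __name__ == "__main__":
    data = make_toy_data()
    w, b, labeled = iterative_screen(data)
    print(len(labeled), "labeled; accuracy on the rest:",
          accuracy_on_unlabeled(data, w, b, labeled))
```

In this sketch the user labels 10 seed items plus 20 per round, and the model classifies the remainder. Because each batch is drawn from the top of the ranking, the labeled set skews toward the target class; a real workflow would likely need to account for that imbalance.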

Keywords: atomic force microscopy; data analysis; iterative screening; machine learning; single-molecule biophysics.

MeSH terms

  • Machine Learning
  • Mechanical Phenomena*
  • Microscopy, Atomic Force / methods
  • Proteins* / chemistry
  • Spectrum Analysis

Substances

  • Proteins