LPATH: A Semiautomated Python Tool for Clustering Molecular Pathways

J Chem Inf Model. 2023 Dec 25;63(24):7610-7616. doi: 10.1021/acs.jcim.3c01318. Epub 2023 Dec 4.

Abstract

The pathways by which a molecular process transitions to a target state are highly sought-after as direct views of a transition mechanism. While great strides have been made in the physics-based simulation of such pathways, the analysis of these pathways can be a major challenge due to their diversity and variable lengths. Here, we present the LPATH Python tool, which implements a semiautomated method for linguistics-assisted clustering of pathways into distinct classes (or routes). This method involves three steps: 1) discretizing the configurational space into key states, 2) extracting a text-string sequence of key visited states for each pathway, and 3) pairwise matching of pathways based on a text-string similarity score. To circumvent the prohibitive memory requirements of the first step, we have implemented a general two-stage method for clustering conformational states that exploits machine learning. LPATH is primarily designed for use with the WESTPA software for weighted ensemble simulations; however, the tool can also be applied to conventional simulations. As demonstrated for the C7eq to C7ax conformational transition of the alanine dipeptide, LPATH provides physically reasonable classes of pathways and corresponding probabilities.
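Steps 2 and 3 of the method described above can be illustrated with a minimal Python sketch. This is not the LPATH implementation. It assumes each pathway has already been discretized into a sequence of integer state indices (step 1). The helper names `to_state_string`, `similarity`, `group_pathways`, and `class_probabilities`, the 0.7 threshold, and the greedy grouping scheme are all hypothetical. The standard-library `difflib.SequenceMatcher` ratio stands in for the text-string similarity score, which the abstract does not specify.

```python
from difflib import SequenceMatcher
from itertools import groupby

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_state_string(state_ids):
    """Step 2 (sketch): encode a pathway as a string of key visited states.

    Consecutive repeats are collapsed so that pathways of different
    lengths that visit the same states in the same order match.
    """
    collapsed = [state for state, _ in groupby(state_ids)]
    return "".join(ALPHABET[state] for state in collapsed)


def similarity(a, b):
    """Step 3 (sketch): similarity score in [0, 1] between two strings.

    Assumption: difflib's ratio is a stand-in metric, not necessarily
    the score used by LPATH.
    """
    return SequenceMatcher(None, a, b).ratio()


def group_pathways(strings, threshold=0.7):
    """Greedy grouping (hypothetical): a pathway joins the first class
    whose first member is at least `threshold` similar to it."""
    classes = []
    for index, string in enumerate(strings):
        for members in classes:
            if similarity(strings[members[0]], string) >= threshold:
                members.append(index)
                break
        else:
            classes.append([index])
    return classes


def class_probabilities(classes, weights):
    """Probability of each class as its share of the total pathway weight."""
    total = sum(weights)
    return [sum(weights[i] for i in members) / total for members in classes]


# Toy pathways over four hypothetical states (0..3), with toy weights.
pathways = [[0, 0, 1, 1, 2], [0, 1, 1, 1, 2, 2], [0, 3, 3, 2], [0, 0, 3, 2]]
weights = [0.4, 0.1, 0.3, 0.2]

strings = [to_state_string(p) for p in pathways]
classes = group_pathways(strings)
probabilities = class_probabilities(classes, weights)
```

In this toy example the first two pathways reduce to the same string and form one class, and the last two form another. A full analysis would more likely build the complete pairwise similarity matrix and apply hierarchical clustering rather than a greedy pass. The per-pathway weights play the role of the statistical weights that a weighted ensemble simulation attaches to each pathway.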

MeSH terms

  • Cluster Analysis
  • Dipeptides* / chemistry
  • Molecular Conformation
  • Molecular Dynamics Simulation*
  • Software

Substances

  • Dipeptides