Ionmob: a Python package for prediction of peptide collisional cross-section values

Bioinformatics. 2023 Sep 2;39(9):btad486. doi: 10.1093/bioinformatics/btad486.

Abstract

Motivation: Including ion mobility separation (IMS) into mass spectrometry proteomics experiments is useful to improve coverage and throughput. Many IMS devices enable linking experimentally derived mobility of an ion to its collisional cross-section (CCS), a highly reproducible physicochemical property dependent on the ion's mass, charge and conformation in the gas phase. Thus, known peptide ion mobilities can be used to tailor acquisition methods or to refine database search results. The large space of potential peptide sequences, driven also by posttranslational modifications of amino acids, motivates an in silico predictor for peptide CCS. Recent studies explored the general performance of varying machine-learning techniques, however, the workflow engineering part was of secondary importance. For the sake of applicability, such a tool should be generic, data driven, and offer the possibility to be easily adapted to individual workflows for experimental design and data processing.

Results: We created ionmob, a Python-based framework for data preparation, training, and prediction of collisional cross-section values of peptides. It is easily customizable and includes a set of pretrained, ready-to-use models and preprocessing routines for training and inference. Using a set of ≈21 000 unique phosphorylated peptides and ≈17 000 MHC ligand sequences and charge state pairs, we expand upon the space of peptides that can be integrated into CCS prediction. Lastly, we investigate the applicability of in silico predicted CCS to increase confidence in identified peptides by applying methods of re-scoring and demonstrate that predicted CCS values complement existing predictors for that task.

Availability and implementation: The Python package is available at github: https://github.com/theGreatHerrLebert/ionmob.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Ions
  • Machine Learning*
  • Mass Spectrometry / methods
  • Peptides* / chemistry
  • Proteomics / methods

Substances

  • Peptides
  • Ions