CROP: correlation-based reduction of feature multiplicities in untargeted metabolomic data

Bioinformatics. 2020 May 1;36(9):2941-2942. doi: 10.1093/bioinformatics/btaa012.

Abstract

Summary: Untargeted liquid chromatography-high-resolution mass spectrometry analysis produces a large number of features, which correspond to potential compounds in the analyzed sample. During data processing, features associated with the same compound must be merged to prevent multiplicities in the data and possible misidentification. Currently employed processing tools use complex algorithms to detect redundancies such as adducts or isotopes. However, most of them cannot deal with unpredictable adducts and in-source fragments. We introduce CROP, a simple open-source R script that removes these redundant features using pairwise Pearson correlations and retention time, together with a graphical representation of the correlation network.
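The grouping idea in the summary can be illustrated with a minimal sketch. It is written in Python with NumPy and is not the CROP R script itself. It assumes that two features are linked when their Pearson correlation across samples reaches a threshold and their retention times differ by no more than a fixed window, and that linked features are merged as connected components of the resulting network. The function names, the threshold values `min_corr` and `max_rt_diff`, and the choice of the most intense feature as group representative are all hypothetical choices made for illustration.

```python
import numpy as np


def group_correlated_features(intensities, retention_times,
                              min_corr=0.75, max_rt_diff=0.02):
    """Group features that are highly correlated and co-eluting.

    intensities: array of shape (n_samples, n_features).
    retention_times: array of shape (n_features,), same unit as max_rt_diff.
    min_corr and max_rt_diff are illustrative placeholder values.
    Returns a list of groups, each a sorted list of feature indices.
    """
    x = np.asarray(intensities, dtype=float)
    rt = np.asarray(retention_times, dtype=float)
    n_features = x.shape[1]

    # Pairwise Pearson correlation between feature columns.
    # A constant feature yields NaN, which never passes the threshold.
    corr = np.corrcoef(x, rowvar=False)

    # Two features are linked if both criteria hold.
    close_rt = np.abs(rt[:, None] - rt[None, :]) <= max_rt_diff
    linked = (corr >= min_corr) & close_rt

    # Connected components of the link network (depth-first search).
    visited = [False] * n_features
    groups = []
    for start in range(n_features):
        if visited[start]:
            continue
        visited[start] = True
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in np.flatnonzero(linked[i]):
                if not visited[j]:
                    visited[j] = True
                    stack.append(int(j))
        groups.append(sorted(members))
    return groups


def pick_representatives(intensities, groups):
    """Keep one feature per group: the one with the highest mean intensity
    (an assumption made for this sketch)."""
    mean_intensity = np.asarray(intensities, dtype=float).mean(axis=0)
    return [max(group, key=lambda i: mean_intensity[i]) for group in groups]
```

Requiring both criteria keeps correlated but chromatographically separated features apart, and keeps co-eluting but uncorrelated features apart. Because groups are connected components, a feature can join a group through a chain of links even when it is not directly linked to every member.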

Availability and implementation: The CROP R-script is available online at www.github.com/rendju/CROP under GNU GPL.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Chromatography, Liquid
  • Mass Spectrometry
  • Metabolomics*
  • Software*