Pipasic: similarity and expression correction for strain-level identification and quantification in metaproteomics

Anke Penzlin; Martin S Lindner; Joerg Doellinger; Piotr Wojtek Dabrowski; Andreas Nitsche; Bernhard Y Renard

doi:10.1093/bioinformatics/btu267

Bioinformatics. 2014 Jun 15;30(12):i149-56. doi: 10.1093/bioinformatics/btu267.

Authors

Anke Penzlin¹, Martin S Lindner¹, Joerg Doellinger², Piotr Wojtek Dabrowski², Andreas Nitsche¹, Bernhard Y Renard¹

Affiliations

¹ Research Group Bioinformatics (NG4), Centre for Biological Threats and Special Pathogens 1 (ZBS 1), Centre for Biological Threats and Special Pathogens 6 (ZBS 6) and Central Administration 4 (IT), Robert Koch Institute, 13353 Berlin, Germany.
² Research Group Bioinformatics (NG4), Centre for Biological Threats and Special Pathogens 1 (ZBS 1), Centre for Biological Threats and Special Pathogens 6 (ZBS 6) and Central Administration 4 (IT), Robert Koch Institute, 13353 Berlin, Germany.

Abstract

Motivation: Metaproteomic analysis allows studying the interplay of organisms or functional groups and has also become increasingly popular for diagnostic purposes. However, difficulties arise owing to the high sequence similarity between related organisms. Furthermore, the degree to which proteins are conserved between species can be correlated with their expression level, which can lead to significant bias in results and interpretation. These challenges are similar but not identical to those arising in the analysis of metagenomic samples and require specific solutions.

Results: We introduce Pipasic (peptide intensity-weighted proteome abundance similarity correction), a tool that corrects identification and spectral counting-based quantification results using peptide similarity estimation and expression level weighting within a non-negative lasso framework. Pipasic has distinct advantages over approaches that consider only unique peptides or that aggregate results to the lowest common ancestor, as demonstrated on examples of viral diagnostics and an acid mine drainage dataset.
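
The correction step described above can be illustrated with a small sketch. This is not the Pipasic source code. It assumes a hypothetical similarity matrix in which entry [i][j] is the fraction of spectra originating from proteome j that would also be attributed to proteome i, and a vector of observed relative spectral counts. The function name nonneg_lasso, the matrix values and the plain coordinate-descent solver are illustrative assumptions; the expression level weighting used to build the similarity estimates is not shown.

```python
def nonneg_lasso(A, b, lam=0.0, n_iter=500):
    """Minimise 0.5 * ||A x - b||^2 + lam * sum(x) subject to x >= 0.

    Plain cyclic coordinate descent; A is a list of rows, b a list.
    Hypothetical helper for illustration, not part of Pipasic.
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    # Squared norm of each column, used as the step denominator.
    col_sq = [sum(A[i][j] ** 2 for i in range(n_rows)) for j in range(n_cols)]
    for _ in range(n_iter):
        for j in range(n_cols):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that ignores x[j].
            rho = 0.0
            for i in range(n_rows):
                pred_without_j = sum(
                    A[i][k] * x[k] for k in range(n_cols) if k != j
                )
                rho += A[i][j] * (b[i] - pred_without_j)
            # Shrink by lam and clip at zero (non-negativity constraint).
            x[j] = max(0.0, (rho - lam) / col_sq[j])
    return x


# Assumed similarity matrix for three related strains (illustrative values).
similarity = [
    [1.0, 0.6, 0.1],
    [0.6, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
# Assumed true abundances: the third strain is absent from the sample.
true_abundance = [0.7, 0.3, 0.0]
# Observed counts are inflated by peptides shared between strains.
observed = [
    sum(similarity[i][j] * true_abundance[j] for j in range(3))
    for i in range(3)
]
corrected = nonneg_lasso(similarity, observed, lam=0.0)
print("observed: ", [round(v, 3) for v in observed])
print("corrected:", [round(v, 3) for v in corrected])
```

In this toy setting the absent third strain still receives a non-zero observed count through shared peptides, and the non-negative fit attributes that signal back to the strains that are present. A positive lam would additionally shrink small abundances towards zero.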

Availability and implementation: Pipasic source code is freely available from https://sourceforge.net/projects/pipasic/.

Contact: RenardB@rki.de

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Bacterial Proteins / metabolism
Cowpox virus / classification
Environmental Microbiology*
Mass Spectrometry
Peptides / chemistry
Proteome / chemistry
Proteome / metabolism*
Proteomics / methods*
Software

Substances

Bacterial Proteins
Peptides
Proteome