UniprotR: Retrieving and visualizing protein sequence and functional information from Universal Protein Resource (UniProt knowledgebase)

J Proteomics. 2020 Feb 20:213:103613. doi: 10.1016/j.jprot.2019.103613. Epub 2019 Dec 14.

Abstract

UniprotR is a software package designed to easily retrieve, cluster and visualize protein data from the UniProt knowledgebase (UniProtKB) using the R language. The package is implemented mainly to process, parse and illustrate proteomics data in a handy and time-saving way, allowing researchers to summarize all required protein information available at UniProtKB in a readable data frame, Excel CSV file, and/or graphical output. UniprotR generates a set of graphics including gene ontology, chromosomal location, protein scoring and status, protein networking, sequence phylogenetic tree, and physicochemical properties. In addition, the package supports clustering of proteins based on primary gene name or chromosomal location, facilitating additional downstream analysis. SIGNIFICANCE: In this work, we implemented a robust package for retrieving and visualizing information from multiple sources such as UniProtKB, SWISS-MODEL, and STRING. UniprotR contains functions that enable retrieving and clustering data in a handy way and visualizing data in publishable graphs, to facilitate researchers' work and fulfill their needs. UniprotR will save time in downstream data analysis by replacing manual, time-consuming data processing. AVAILABILITY AND IMPLEMENTATION: UniprotR is released as free open source code under the GPLv3 license and is available on CRAN (The Comprehensive R Archive Network, https://cran.r-project.org/web/packages/UniprotR/index.html) and GitHub (https://github.com/Proteomicslab57357/UniprotR).
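
The idea of clustering proteins by primary gene name, mentioned in the abstract, can be illustrated with a small, library-independent sketch. It is written in Python rather than R, and it does not use or reproduce any UniprotR function: the function name `cluster_by_gene`, the placeholder accessions, and the placeholder gene names are all hypothetical, and the case-insensitive matching is an assumption made only for this example.

```python
from collections import defaultdict


def cluster_by_gene(records):
    """Group accession IDs by primary gene name.

    `records` is an iterable of (accession, gene_name) pairs.
    Gene names are compared case-insensitively (an assumption of this
    sketch); entries without a gene name go into an "UNKNOWN" cluster.
    """
    clusters = defaultdict(list)
    for accession, gene in records:
        key = gene.strip().upper() if gene and gene.strip() else "UNKNOWN"
        clusters[key].append(accession)
    return dict(clusters)


# Hypothetical placeholder records, not real UniProtKB entries.
records = [
    ("ACC001", "GENE_A"),
    ("ACC002", "gene_a"),
    ("ACC003", "GENE_B"),
    ("ACC004", ""),
]

clusters = cluster_by_gene(records)
for gene, accessions in sorted(clusters.items()):
    print(gene, accessions)
```

In a real workflow the pairs would come from a table of retrieved UniProtKB annotations, and each resulting cluster could then be passed to downstream analysis separately.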

Keywords: Bioinformatics; Proteomics; R package; UniProt; UniProtKB.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence*
  • Knowledge Bases*
  • Phylogeny*
  • Proteins / genetics
  • Software*

Substances

  • Proteins