Plant-PrAS: a database of physicochemical and structural properties and novel functional regions in plant proteomes

Plant Cell Physiol. 2015 Jan;56(1):e11. doi: 10.1093/pcp/pcu176. Epub 2014 Nov 29.

Abstract

Arabidopsis thaliana is an important model species for studies of plant gene function. Research on Arabidopsis has produced high-quality genome sequences, annotations and related post-genomic studies. The amount of annotation, such as gene-coding regions and structures, is steadily growing in the field of plant research. In contrast to the genomic resources for animals and microorganisms, however, the characterization of some gene functions remains difficult in plant genomics studies. Information on protein structure can help elucidate the corresponding gene function because proteins encoded in the genome possess highly specific structures and functions. In this study, we calculated multiple physicochemical and secondary structural parameters of protein sequences, including length, hydrophobicity, the amount of secondary structure, the number of intrinsically disordered regions (IDRs) and the predicted presence of transmembrane helices and signal peptides, using a total of 208,333 protein sequences from the genomes of six representative plant species: Arabidopsis thaliana, Glycine max (soybean), Populus trichocarpa (poplar), Oryza sativa (rice), Physcomitrella patens (moss) and Cyanidioschyzon merolae (alga). Using the PASS tool and the Rosetta Stone method, we annotated the presence of novel functional regions in 1,732 protein sequences that included unannotated sequences from the Arabidopsis and rice proteomes. These results were organized into the Plant Protein Annotation Suite database (Plant-PrAS), which can be freely accessed online at http://plant-pras.riken.jp/.
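
As an illustration of the simplest per-sequence parameters named above (length and hydrophobicity), the following is a minimal sketch and not the pipeline used to build Plant-PrAS. The abstract does not state which hydrophobicity scale was used; the sketch assumes the Kyte-Doolittle scale and reports the mean hydropathy (GRAVY). The function name `sequence_properties` and the example sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
# Assumption: the abstract does not name a scale; this one is chosen for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sequence_properties(seq):
    """Return length and mean hydropathy (GRAVY) of a protein sequence.

    Residues outside the 20 standard amino acids (e.g. 'X' or '*')
    are ignored for both the length and the average.
    """
    residues = [aa for aa in seq.strip().upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    gravy = sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)
    return {"length": len(residues), "gravy": gravy}


if __name__ == "__main__":
    # Hypothetical example sequence, not taken from any of the six proteomes.
    print(sequence_properties("MKTLLVAAGLS"))
```

Positive GRAVY values indicate a more hydrophobic sequence on this scale and negative values a more hydrophilic one. The other parameters listed in the abstract (secondary structure, IDRs, transmembrane helices, signal peptides) require dedicated prediction tools and are outside the scope of this sketch.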

Keywords: Database; Gene function; Physicochemical property; Plant protein; Protein property.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics
  • Arabidopsis / metabolism
  • Bryopsida / genetics
  • Bryopsida / metabolism
  • Chromosome Mapping
  • Databases, Protein*
  • Information Storage and Retrieval*
  • Internet
  • Molecular Sequence Annotation
  • Open Reading Frames
  • Oryza / genetics
  • Oryza / metabolism
  • Plant Proteins / chemistry*
  • Plant Proteins / genetics
  • Plant Proteins / metabolism
  • Plants / genetics
  • Plants / metabolism*
  • Populus / genetics
  • Populus / metabolism
  • Proteome*
  • Rhodophyta / genetics
  • Rhodophyta / metabolism

Substances

  • Plant Proteins
  • Proteome