PhytoTypeDB: a database of plant protein inter-cultivar variability and function

Marco Necci; Damiano Piovesan; Diego Micheletti; Lisanna Paladin; Alessandro Cestaro; Silvio C E Tosatto

doi:10.1093/database/bay125

Database (Oxford). 2018 Jan 1:2018:bay125. doi: 10.1093/database/bay125.

Authors

Marco Necci¹ ² ³, Damiano Piovesan¹, Diego Micheletti³, Lisanna Paladin¹, Alessandro Cestaro³, Silvio C E Tosatto¹ ⁴

Affiliations

¹ Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, Padua, Italy.
² Department of Agricultural Sciences, University of Udine, via Palladio 8, Udine, Italy.
³ Fondazione Edmund Mach, Via E. Mach 1, 38010 S. Michele all'Adige, Italy.
⁴ Consiglio Nazionale delle Ricerche Institute of Neuroscience, via U. Bassi 58/b, Padua, Italy.

Abstract

Despite a fast-growing number of available plant genomes, existing computational resources are poorly integrated and provide only limited access to the underlying data. Most existing databases focus on DNA/RNA data or specific gene families, with less emphasis on protein structure, function and variability. In particular, despite the economic importance of many plant accessions, there are no straightforward ways to retrieve or visualize information on their differences. To fill this gap, we developed PhytoTypeDB (http://phytotypedb.bio.unipd.it/), a scalable database containing plant protein annotations and genetic variants from resequencing of different accessions. The database content is generated by an integrated pipeline that exploits state-of-the-art methods for protein characterization and requires only the proteome reference sequence and variant calling files. Protein names for unknown proteins are inferred by homology for over 95% of the entries. Single-nucleotide variants are visualized along with protein annotation in a user-friendly web interface. The server offers an effective querying system, which allows users to compare variability among different species and accessions, to generate custom data sets based on shared functional features or to perform sequence searches. A documented set of exposed RESTful endpoints makes the data accessible programmatically by third-party clients.
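As an illustration of how a single-nucleotide variant can be related to a protein annotation, the following minimal Python sketch maps a substitution in a coding sequence to its amino acid change. It is not the PhytoTypeDB pipeline: the function name `protein_effect`, its arguments and the toy sequence are hypothetical, and the sketch assumes a forward-strand, intron-free coding sequence translated with the standard genetic code.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def protein_effect(cds, pos, alt):
    """Classify a single-nucleotide substitution in a coding sequence.

    Hypothetical helper for illustration only.
    cds: coding sequence (forward strand, no introns, length a multiple of 3)
    pos: 0-based position of the substituted base within cds
    alt: alternative base (one of T, C, A, G)
    Returns (residue_number, ref_aa, alt_aa, kind), residue_number being 1-based.
    """
    codon_start = pos - pos % 3
    ref_codon = cds[codon_start:codon_start + 3]

    # Substitute the base inside its codon.
    bases = list(ref_codon)
    bases[pos % 3] = alt
    alt_codon = "".join(bases)

    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]

    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return codon_start // 3 + 1, ref_aa, alt_aa, kind


if __name__ == "__main__":
    toy_cds = "ATGGCTTGGTGA"  # toy sequence encoding M, A, W, stop
    print(protein_effect(toy_cds, 4, "A"))
    print(protein_effect(toy_cds, 5, "C"))
    print(protein_effect(toy_cds, 8, "A"))
```

The residue number returned by such a function is what would allow a variant to be placed alongside per-residue protein annotations; handling of reverse-strand genes, introns and multi-base variants is deliberately left out of this sketch.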

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Databases, Protein*
Internet
Molecular Sequence Annotation
Plant Proteins* / genetics
Plant Proteins* / physiology
Proteome / genetics
Proteome / physiology
Proteomics
User-Computer Interface

Substances

Plant Proteins
Proteome