PITDB: a database of translated genomic elements

Shyamasree Saha; Eleni A Chatzimichali; David A Matthews; Conrad Bessant

doi:10.1093/nar/gkx906

Nucleic Acids Res. 2018 Jan 4;46(D1):D1223-D1228. doi: 10.1093/nar/gkx906.

Authors

Shyamasree Saha¹, Eleni A Chatzimichali¹, David A Matthews², Conrad Bessant¹,³

Affiliations

¹ School of Biological and Chemical Sciences, Queen Mary University of London, Mile End, London E1 4NS, UK.
² School of Cellular and Molecular Medicine, University of Bristol, University Walk, Bristol BS8 1TD, UK.
³ Centre for Computational Biology, Life Science Institute, Queen Mary University of London, Mile End, London E1 4NS, UK.

Abstract

PITDB is a freely available database of translated genomic elements (TGEs) that have been observed in PIT (proteomics informed by transcriptomics) experiments. In PIT, a sample is analyzed using both RNA-seq transcriptomics and proteomic mass spectrometry. Transcripts assembled from RNA-seq reads are used to create a library of sample-specific amino acid sequences against which the acquired mass spectra are searched, permitting detection of any TGE, not just those in canonical proteome databases. At the time of writing, PITDB contains over 74 000 distinct TGEs from four species, supported by more than 600 000 peptide spectrum matches. The database, accessible via http://pitdb.org, provides supporting evidence for each TGE, often from multiple experiments, together with an indication of the confidence in the TGE's observation and of its type. Types range from known protein (exact match to a UniProt protein sequence), through multiple types of protein variant including various splice isoforms, to putative novel molecule. PITDB's modern web interface allows TGEs to be viewed individually or by species or experiment, and downloaded for further analysis. PITDB is for bench scientists seeking to share their PIT results, for researchers investigating novel genome products in model organisms and for those wishing to construct proteomes for lesser-studied species.
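The step in which assembled transcripts become a searchable library of amino acid sequences can be illustrated with a minimal Python sketch. This is not the PIT or PITDB implementation: the function names `translate` and `candidate_orfs`, the `min_len` threshold, the restriction to the three forward reading frames and the use of the standard genetic code are all assumptions made here for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate a nucleotide sequence in frame 0; '*' marks a stop codon.

    Codons containing characters other than T, C, A, G become 'X'.
    A trailing partial codon is ignored.
    """
    seq = seq.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def candidate_orfs(transcript, min_len=3):
    """Return candidate amino acid sequences from the three forward frames.

    Hypothetical helper: each frame is split at stop codons, and a segment
    is kept from its first methionine onward if at least `min_len` residues
    remain. Segments running off the 3' end without a stop are also kept.
    """
    orfs = []
    for frame in range(3):
        protein = translate(transcript[frame:])
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start >= min_len:
                orfs.append(segment[start:])
    return orfs


if __name__ == "__main__":
    # Toy transcript, invented for this example.
    print(candidate_orfs("CCATGGCTAAATGATT"))
```

In a real PIT analysis the resulting sequences would be written out as a sequence database for a mass spectrometry search engine. A production pipeline would also need to handle strand, partial ORFs and much longer minimum lengths, none of which is specified in the abstract above.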

Publication types

Dataset
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Amino Acid Sequence
Animals
Data Display
Databases, Factual*
Humans
Internet
Open Reading Frames
Protein Biosynthesis
Protein Isoforms / genetics
Proteins / chemistry*
Proteins / genetics*
Proteomics / methods
Sequence Analysis, RNA*
Tandem Mass Spectrometry
User-Computer Interface

Substances

Protein Isoforms
Proteins
