Mining high-throughput experimental data to link gene and function

Crysten E Blaby-Haas; Valérie de Crécy-Lagard

doi:10.1016/j.tibtech.2011.01.001

Trends Biotechnol. 2011 Apr;29(4):174-82. doi: 10.1016/j.tibtech.2011.01.001.

Authors

Crysten E Blaby-Haas¹, Valérie de Crécy-Lagard

Affiliation

¹ Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611, USA.

Abstract

Nearly 2200 genomes that encode around 6 million proteins have now been sequenced. Around 40% of these proteins are of unknown function, even when function is loosely and minimally defined as 'belonging to a superfamily'. In addition to in silico methods, the swelling stream of high-throughput experimental data can give valuable clues for linking these unknowns with precise biological roles. The goal is to develop integrative data-mining platforms that allow the scientific community at large to access and utilize this rich source of experimental knowledge. To this end, we review recent advances in generating whole-genome experimental datasets, where this data can be accessed, and how it can be used to drive prediction of gene function.

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.
Review

MeSH terms

Animals
Computer Simulation
Data Mining / methods*
Genes*
Genomics / methods*
High-Throughput Nucleotide Sequencing / methods*
Humans
Mice
Phenotype
