Contribution of microarray data to the advancement of knowledge on the Mycobacterium tuberculosis interactome: use of the random partial least squares approach

Infect Genet Evol. 2011 Jun;11(4):725-33. doi: 10.1016/j.meegid.2011.04.012. Epub 2011 Apr 14.

Abstract

Following the central dogma of molecular biology, in which information flows from gene to protein through the transcript, gene expression data report on the functional state of an organism. Microarray technology was developed to measure the expression levels of thousands of genes simultaneously. The vast amounts of data generated at all levels of biological organization help to identify co-expressed genes, which may reveal proteins that interact in a complex or act in the same pathway without direct physical contact. Finding associations between the regulatory patterns of characterized proteins and those of hypothetical proteins may identify functional relationships between them and facilitate the characterization of proteins of unknown function. Here we use the random partial least squares regression technique (r-PLS) to trace connections between co-expressed genes in Mycobacterium tuberculosis, using data downloaded from public microarray databases. We generated the overall topology of a microbial co-expression network with the exact complexity of the model. This approach provides a general method for generating a co-expression network of an organism for the purpose of systems-level analyses.
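To make the idea of a co-expression network concrete, the sketch below links genes whose expression profiles are strongly correlated across samples. It is not the r-PLS procedure used in the paper, whose details the abstract does not give; it substitutes plain Pearson correlation with a cutoff as a simpler stand-in. The gene names, expression values, the 0.9 threshold, and the function names `pearson` and `coexpression_edges` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        # A flat profile carries no co-expression signal.
        return 0.0
    return cov / (sd_x * sd_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold.

    `profiles` maps a gene name to its expression values across samples.
    The threshold is an assumed cutoff, not a value from the paper.
    """
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical toy data: three genes measured in five samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises with geneA
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],  # no clear relation to the others
}

edges = coexpression_edges(profiles)
for g1, g2, r in edges:
    print(g1, g2, round(r, 3))
```

In this toy example only the geneA and geneB pair passes the cutoff, so the network has a single edge. A regression-based method such as r-PLS would replace the pairwise correlation score with one derived from fitted models, but the final step of turning scores into edges is similar.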

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism
  • Computer Simulation
  • Databases, Genetic
  • Gene Expression Profiling*
  • Gene Expression Regulation, Bacterial*
  • Humans
  • Least-Squares Analysis
  • Models, Biological
  • Mycobacterium tuberculosis / genetics*
  • Mycobacterium tuberculosis / metabolism*
  • Oligonucleotide Array Sequence Analysis*
  • Protein Binding / physiology

Substances

  • Bacterial Proteins