LEMming: A Linear Error Model to Normalize Parallel Quantitative Real-Time PCR (qPCR) Data as an Alternative to Reference Gene Based Methods

PLoS One. 2015 Sep 1;10(9):e0135852. doi: 10.1371/journal.pone.0135852. eCollection 2015.

Abstract

Background: Gene expression analysis is an essential part of biological and medical investigations. Quantitative real-time PCR (qPCR) is characterized by excellent sensitivity, dynamic range, and reproducibility, and is still regarded as the gold standard for quantifying transcript abundance. Parallelization of qPCR, for example on the microfluidic Fluidigm BioMark platform with TaqMan assays, enables evaluation of multiple transcripts in samples treated under various conditions. Despite these advanced technologies, correct evaluation of the measurements remains challenging. The most widely used methods for normalizing and calculating gene expression data are geNorm and ΔΔCt, respectively. Both rely on one or several stable reference genes (RGs) for normalization and can therefore yield biased results. We therefore applied multivariable regression with a tailored error model to overcome the need for stable RGs.
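For context, the ΔΔCt method mentioned above normalizes a target gene's Ct value against a reference gene and a control condition. A minimal sketch, assuming ~100% PCR efficiency; the function name and all Ct values are illustrative, not taken from the study:

```python
# Minimal sketch of the 2^-ΔΔCt calculation. Gene names, Ct values,
# and the helper name are illustrative, not from the study.

def delta_delta_ct(ct_target_treated, ct_ref_treated,
                   ct_target_control, ct_ref_control):
    """Return the fold change of a target gene, treated vs. control,
    normalized to a reference gene (assumes ~100% PCR efficiency,
    i.e. a perfect doubling per cycle)."""
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)

# Example: the target amplifies 2 cycles earlier under treatment
# relative to the reference gene -> ~4-fold up-regulation.
print(delta_delta_ct(22.0, 18.0, 24.0, 18.0))  # 4.0
```

Note that the result is only as reliable as the reference gene: if the RG itself responds to the treatment, the fold change is biased, which is exactly the problem the abstract addresses.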

Results: We developed an RG-independent data normalization approach for parallel qPCR data, called LEMming, based on a tailored linear error model. It rests on the assumption that the mean Ct values of samples within similarly treated groups are equal. The performance of LEMming was evaluated on three data sets with different RG stability patterns and compared to geNorm normalization. Data set 1 showed that both methods give similar results when stable RGs are available. Data set 2 included RGs that were stable by geNorm criteria but became differentially expressed in the normalized data when evaluated by a t-test; geNorm-normalized data showed a shifted mean per gene per condition, whereas LEMming-normalized data did not. Comparing the reduction in standard deviation relative to raw data, LEMming outperformed geNorm. In data set 3, stable RGs were available according to geNorm's average expression stability and pairwise variation measures, but t-tests on the raw data contradicted this; normalization with these RGs distorted the data in a way that contradicted the literature, while LEMming-normalized data did not.
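The core assumption above, equal mean Ct per sample within a treatment group, permits normalization without any reference gene by removing each sample's offset. The following is a simplified illustration of that idea under stated assumptions, not the authors' full regression model; the function name and values are hypothetical:

```python
# Simplified illustration of RG-free normalization, assuming the mean
# Ct per sample is equal within a treatment group. This is a sketch of
# the idea only, NOT the full LEMming linear error model.

def center_samples(ct_matrix):
    """ct_matrix: list of samples, each a list of Ct values (one per
    gene). Subtracting each sample's mean Ct puts all samples on a
    common baseline, removing sample-wise offsets (e.g. differences in
    loaded cDNA) without using a reference gene."""
    normalized = []
    for sample in ct_matrix:
        mean_ct = sum(sample) / len(sample)
        normalized.append([ct - mean_ct for ct in sample])
    return normalized

# Two replicates of one group; sample 2 carries a +1 Ct offset that
# mean-centering removes.
raw = [[20.0, 25.0, 30.0],
       [21.0, 26.0, 31.0]]
for row in center_samples(raw):
    print(row)  # both rows become [-5.0, 0.0, 5.0]
```

As the Conclusions note, such centering alone cannot distinguish a systematic per-group error from a genuine global treatment effect, which is why an additional measurement such as total cDNA content is required.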

Conclusions: If RGs are coexpressed but not independent of the experimental conditions, stability criteria based on inter- and intragroup variation fail. The linear error model we developed, LEMming, removes the dependency on RGs for parallel qPCR measurements and additionally resolves biases of both technical and biological origin. However, an additional measurement is needed to distinguish systematic errors per treated group from a global treatment effect; quantifying the total cDNA content per sample helps to identify such systematic errors.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Datasets as Topic
  • Gene Expression Profiling
  • Genes
  • Humans
  • Linear Models
  • Mice
  • Real-Time Polymerase Chain Reaction / standards*
  • Reference Standards

Grants and funding

This work was funded by the Bundesministerium fuer Bildung und Forschung (BMBF) Virtual Liver Network (grants FKZ 0315755, FKZ 0315736, FKZ 0315765, FKZ 0315751) and by the Robert Bosch Foundation, Stuttgart, and was supported by the DFG funding program Open Access Publishing, Deutsche Forschungsgemeinschaft (grant support: Klinische Forschergruppe 117, Optimierung der Leberlebendspende, grant number Da251/5-2 and 3, Project B2, KFO117). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.