Gamma-Normal-Gamma mixture model for detecting differentially methylated loci in three breast cancer cell lines

Cancer Inform. 2007 Feb 7:3:43-54.

Abstract

With state-of-the-art microarray technologies now available for whole genome CpG island (CGI) methylation profiling, there is a need to develop statistical models that are specifically geared toward the analysis of such data. In this article, we propose a Gamma-Normal-Gamma (GNG) mixture model for describing three groups of CGI loci: hypomethylated, undifferentiated, and hypermethylated, from a single methylation microarray. This model was applied to study the methylation signatures of three breast cancer cell lines: MCF7, T47D, and MDAMB361. Biologically interesting and interpretable results are obtained, which highlights the heterogeneity nature of the three cell lines. This underlies the premise for the need of analyzing each of the microarray slides individually as opposed to pooling them together for a single analysis. Our comparisons with the fitted densities from the Normal-Uniform (NU) mixture model in the literature proposed for gene expression analysis show an improved goodness of fit of the GNG model over the NU model. Although the GNG model was proposed in the context of single-slide methylation analysis, it can be readily adapted to analyze multi-slide methylation data as well as other types of microarray data.

Keywords: CpG islands; breast cancer cell lines; methylation/epigenetic signature; microarrays; mixture modeling.