Investigating the impact of Down syndrome on methylation and glycomics with two-stage PO2PLS

Theor Biol Forum. 2021 Jan 1;114(1-2):29-44. doi: 10.19272/202111402004.

Abstract

Down syndrome (DS) is a condition that leads to precocious and accelerated aging in affected subjects. Several alterations in DS cases have been reported at the molecular level, particularly in methylation and glycosylation. Investigating the relation between methylation, glycomics and DS can lead to new insights into the mechanisms underlying this atypical aging. We consider a data integration approach in which we investigate how DS affects the correlated parts of glycomics and methylation, and which CpG sites and glycans are relevant. Our motivating datasets consist of methylation and glycomics data, measured on 29 DS patients and their unaffected siblings and mothers. The family-based case-control design needs to be taken into account when studying the relationship between methylation, glycomics and DS. We propose a two-stage approach that first integrates the methylation and glycomics data and then links the joint information to Down syndrome. For the data integration step, we consider probabilistic two-way orthogonal partial least squares (PO2PLS). PO2PLS models two omics datasets in terms of low-dimensional joint and omic-specific latent components, and takes into account heterogeneity across the omics data. The relationship between the omics data can be statistically tested. The joint components represent the joint information in methylation and glycomics. In the second stage, we use a linear mixed model to model the relationship between DS and the joint methylation and glycomics components. For the components that are significantly associated with DS, we identify the most important CpG sites and glycans. A simulation study is conducted to evaluate the performance of our approach. The results showed that the effects of DS on the omics data can be detected when the sample size is large, and that the accuracy of the feature selection was high in both small and large sample sizes. When our approach was applied to the DS datasets, a significant effect of DS on the joint components was found. The identified CpG sites and glycans appeared to be related to DS. Our proposed method, which jointly analyzes multiple omics data with an outcome variable, may provide new insight into the molecular implications of DS at different omics levels.
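
The two stages described above can be sketched in equations. The abstract itself gives no formulas, so everything below is an assumed, illustrative formulation: the symbols (x, y, t, u, W, C, B and the subscript o for omic-specific parts) are notation chosen here, and the random family intercept in the second stage is one plausible way to account for the family-based design, not necessarily the exact specification used in the paper.

```latex
% Stage 1 (assumed notation): PO2PLS-style decomposition of the two omics datasets.
% x: methylation vector, y: glycomics vector for one subject.
% t, u: joint latent components; t_o, u_o: omic-specific components; e, f, h: noise.
\begin{align}
  x &= t\,W^{\top} + t_{o}\,W_{o}^{\top} + e, \\
  y &= u\,C^{\top} + u_{o}\,C_{o}^{\top} + f, \\
  u &= t\,B + h.
\end{align}

% Stage 2 (assumed specification): linear mixed model per joint component k,
% for member j of family i, with a random family intercept b_{ik}.
\begin{align}
  \hat{t}_{ijk} &= \beta_{0k} + \beta_{1k}\,\mathrm{DS}_{ij} + b_{ik} + \varepsilon_{ijk},
  \qquad b_{ik} \sim \mathcal{N}(0, \sigma_{b,k}^{2}),
  \quad \varepsilon_{ijk} \sim \mathcal{N}(0, \sigma_{k}^{2}).
\end{align}
```

Under this sketch, a test of \beta_{1k} = 0 would correspond to testing whether DS is associated with joint component k, and the largest entries of the corresponding columns of W and C would point to the most relevant CpG sites and glycans.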

Keywords: Down Syndrome; Omics Data Integration; Probabilistic O2PLS.

MeSH terms

  • DNA Methylation
  • Down Syndrome* / genetics
  • Female
  • Glycomics* / methods
  • Humans
  • Polysaccharides
  • Protein Processing, Post-Translational

Substances

  • Polysaccharides