Sequence-to-expression approach to identify etiological non-coding DNA variations in P53 and cMYC-driven diseases

Res Sq [Preprint]. 2023 Jul 12:rs.3.rs-3037310. doi: 10.21203/rs.3.rs-3037310/v1.

Abstract

Background and methods: Disease risk prediction based on DNA sequence and transcriptional profile can improve disease screening, prevention, and potential therapeutic approaches by revealing contributing genetic factors and altered regulatory networks. Although genome-wide association studies have identified many disease-associated DNA variants, the ability to distinguish deleterious non-coding DNA variations remains poor for most common diseases. We previously reported that non-coding variations disrupting cis-overlapping motifs (CisOMs) of opposing transcription factors significantly affect enhancer activity. Here, we designed in vitro experiments to uncover the significance of co-occupancy, competitive binding, and inhibition between P53 and cMYC for the expression of their common target genes.

Results: Analysis of publicly available ChIP-seq data for P53 and cMYC in human embryonic stem cells and mouse embryonic cells showed that ~344-366 genomic regions are co-occupied by P53 and cMYC. We identified, on average, two CisOMs per region, suggesting that co-occupancy is evolutionarily conserved in vertebrates. Treating U2OS cells with doxorubicin increased the P53 protein level while reducing the cMYC level. In contrast, no change in protein levels was observed in Raji cells. ChIP-seq analysis showed that 16-922 genomic regions were co-occupied by P53 and cMYC before and after treatment, and replacement of cMYC signals by P53 signals was detected after doxorubicin treatment in U2OS cells. According to RNA-seq data, around 187 expressed genes near co-occupied regions were altered at the mRNA level. Using a computational motif-matching approach, we determined that changes in predicted P53 binding affinity caused by DNA variations in CisOMs of co-occupied elements significantly correlate with alterations in reporter gene expression. We performed a similar analysis using SNPs mapped in CisOMs for P53 and cMYC from ChIP-seq data in U2OS and Raji cells, together with expression of target genes from the GTEx portal.
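
The motif-matching step described above can be illustrated with a minimal sketch. The abstract does not specify the scoring model, so this example assumes a simple log-odds position weight matrix (PWM) and defines the change in predicted binding affinity as the difference in best PWM score between the alternate and reference sequences. The matrix `TOY_COUNTS`, the sequences, and the function names are hypothetical and are not the P53 or cMYC motifs used in the study.

```python
import math

BASES = "ACGT"

# Hypothetical 4-bp count matrix (consensus CACG); NOT a real P53 or cMYC motif.
TOY_COUNTS = [
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
]


def log_odds(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into log2-odds weights."""
    pwm = []
    for col in counts:
        total = sum(col[b] for b in BASES) + 4 * pseudocount
        pwm.append(
            {b: math.log2(((col[b] + pseudocount) / total) / background) for b in BASES}
        )
    return pwm


def best_score(pwm, seq):
    """Highest PWM score over all windows of seq (forward strand only)."""
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def delta_score(pwm, ref_seq, pos, alt):
    """Best score of the variant sequence minus best score of the reference."""
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    return best_score(pwm, alt_seq) - best_score(pwm, ref_seq)


pwm = log_odds(TOY_COUNTS)
# A substitution inside the best match lowers the score, so the delta is negative.
print(delta_score(pwm, "TTCACGTT", 3, "T"))
```

For a CisOM, the same variant would be scored against the PWM of each overlapping factor, giving one delta per factor. A real analysis would also scan the reverse strand and use curated matrices.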

Conclusions: We found a significant correlation between changes in motif-predicted cMYC binding affinity caused by SNPs in CisOMs and altered gene expression. Our study brings us closer to developing a generally applicable approach to filter etiological non-coding variations associated with P53 and cMYC-dependent diseases.
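
As an illustration of how such a correlation could be computed, the sketch below calculates a Spearman rank correlation between per-variant changes in predicted binding affinity and changes in expression. The abstract does not state which statistic was used, so the choice of Spearman is an assumption, and the two input lists are made-up placeholder values, not data from the study.

```python
def ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical placeholder values, one entry per variant.
delta_affinity = [-4.0, -1.5, 0.0, 0.5, 2.0]
expression_change = [-1.2, -0.3, 0.1, 0.0, 0.9]
print(spearman(delta_affinity, expression_change))
```

A rank-based statistic is used here because it does not assume a linear relationship between score change and expression change. Assessing significance would additionally require a p-value, for example from a permutation test.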

Keywords: Non-coding DNA variants; common complex diseases; computational tools; proto-oncogene; tumor suppressor.

Publication types

  • Preprint