Differential allelic representation (DAR) identifies candidate eQTLs and improves transcriptome analysis

bioRxiv [Preprint]. 2023 Mar 21:2023.03.02.530865. doi: 10.1101/2023.03.02.530865.

Abstract

In comparisons between mutant and wild-type genotypes, transcriptome analysis can reveal the direct impacts of a mutation, together with the homeostatic responses of the biological system. Recent studies have highlighted that, when homozygous mutations are studied in non-isogenic backgrounds, genes from the same chromosome as a mutation often appear over-represented among differentially expressed (DE) genes. One hypothesis suggests that DE genes chromosomally linked to a mutation may not reflect true biological responses to the mutation but, instead, result from differences in representation of expression quantitative trait loci (eQTLs) between sample groups selected on the basis of mutant or wild-type genotype. This is problematic when inclusion of spurious DE genes in a functional enrichment study results in incorrect inferences about the mutation's effect. Here we show that chromosomally co-located differentially expressed genes (CC-DEGs) can also be observed in analyses of dominant mutations in heterozygotes. We define a method and a metric to quantify, in RNA-sequencing data, localised differential allelic representation (DAR) between groups of samples subject to differential expression analysis. We show how the DAR metric can predict regions prone to eQTL-driven differential expression, and how it can improve functional enrichment analyses through gene exclusion or weighting of gene-level rankings. A further advantage is that this improved ability to identify probable eQTLs also reveals examples of CC-DEGs that are likely to be functionally related to a mutant phenotype. This supports a long-standing prediction that selection for advantageous linkage disequilibrium influences chromosome evolution. By comparing the genomes of zebrafish (Danio rerio) and medaka (Oryzias latipes), a teleost with a conserved ancestral karyotype, we find possible examples of chromosomal aggregation of CC-DEGs during evolution of the zebrafish lineage.
The DAR metric provides a solid foundation for addressing the eQTL issue in new and existing datasets because it relies solely on RNA-sequencing data.
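
The abstract does not give the formula for the DAR metric, so the following Python sketch only illustrates the general idea of comparing allelic representation between two sample groups. It assumes, purely for illustration, that DAR at a single-nucleotide site is the Euclidean distance between the two groups' allele-proportion vectors, scaled to lie between 0 and 1, that "localised" values come from a simple moving average over neighbouring sites, and that a gene-level ranking score is down-weighted by (1 - DAR). All function names, the window size and the weighting rule are hypothetical and are not taken from the preprint.

```python
import math

ALLELES = ("A", "C", "G", "T")


def allele_proportions(counts):
    """Convert pooled allele counts for one sample group into proportions."""
    total = sum(counts.get(a, 0) for a in ALLELES)
    if total == 0:
        raise ValueError("no allele observations at this site")
    return [counts.get(a, 0) / total for a in ALLELES]


def dar_at_site(group_a_counts, group_b_counts):
    """Assumed per-site DAR: scaled Euclidean distance between proportions.

    Dividing by sqrt(2) maps the distance onto [0, 1], where 0 means the
    groups carry identical allele proportions and 1 means they share none.
    """
    p = allele_proportions(group_a_counts)
    q = allele_proportions(group_b_counts)
    distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))
    return distance / math.sqrt(2)


def smooth_dar(site_values, window=3):
    """Centred moving average over neighbouring sites (hypothetical choice)."""
    half = window // 2
    smoothed = []
    for i in range(len(site_values)):
        lo = max(0, i - half)
        hi = min(len(site_values), i + half + 1)
        smoothed.append(sum(site_values[lo:hi]) / (hi - lo))
    return smoothed


def dar_weighted_score(score, dar):
    """Down-weight a gene-level ranking score by local DAR (assumed rule)."""
    return score * (1.0 - dar)
```

Under these assumptions, a gene in a region where the mutant and wild-type groups carry entirely different alleles would receive a weight of zero, while a gene in a region with identical allelic representation would keep its original ranking score. Excluding genes above a chosen DAR threshold would be the alternative, gene-exclusion use of the same values.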

Publication types

  • Preprint