Regulatory variants: from detection to predicting impact

Brief Bioinform. 2019 Sep 27;20(5):1639-1654. doi: 10.1093/bib/bby039.

Abstract

Variants within non-coding genomic regions can greatly affect disease. In recent years, increasing attention has been given to these variants and to how they can alter regulatory elements, such as enhancers, transcription factor binding sites and DNA methylation regions. Such variants can be considered regulatory variants. Concurrently, much effort has been put into establishing international consortia to undertake large projects aimed at discovering regulatory elements in different tissues, cell lines and organisms, and at probing the effects of genetic variants on regulation by measuring gene expression. Here, we describe methods and techniques for discovering disease-associated non-coding variants using sequencing technologies. We then explain the computational procedures that can be used for annotating these variants with information from the aforementioned projects, and for predicting their putative effects, including potential pathogenicity, using rule-based and machine learning approaches. We detail techniques for validating these predictions by mapping chromatin-chromatin and chromatin-protein interactions, and introduce Clustered Regularly Interspaced Short Palindromic Repeats-Associated Protein 9 (CRISPR-Cas9) technology, which has already been used in this field and is likely to have a major impact on its future evolution. We also give examples of regulatory variants associated with multiple complex diseases. This review is aimed at bioinformaticians interested in the characterization of regulatory variants, at molecular biologists and geneticists interested in understanding more about the nature and potential role of such variants from a functional point of view, and at clinicians who may wish to learn about variants in non-coding genomic regions associated with a given disease and find out what to do next to uncover how these variants affect the underlying mechanisms.

Keywords: GWAS; complex diseases; non-coding DNA; regulatory variants; variant analysis.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Chromatin / metabolism
  • Clustered Regularly Interspaced Short Palindromic Repeats*
  • Genome, Human
  • Humans
  • Machine Learning
  • Protein Binding
  • Regulatory Sequences, Nucleic Acid*

Substances

  • Chromatin