Epigenomic annotation-based interpretation of genomic data: from enrichment analysis to machine learning

Bioinformatics. 2017 Oct 15;33(20):3323-3330. doi: 10.1093/bioinformatics/btx414.

Abstract

Motivation: One of the goals of functional genomics is to understand the regulatory implications of experimentally obtained genomic regions of interest (ROIs). Most sequencing technologies now generate ROIs distributed across the whole genome. Interpreting these genome-wide ROIs is challenging because the majority of them lie outside functionally well-defined protein-coding regions. Recent efforts by members of the International Human Epigenome Consortium have generated large volumes of functional/regulatory data (reference epigenomic datasets), effectively annotating the genome with epigenomic properties. Consequently, a wide variety of computational tools that use these epigenomic datasets to interpret genomic data has been developed.

Results: The purpose of this review is to provide a structured overview of practical solutions for the interpretation of ROIs with the help of epigenomic data. Starting with epigenomic enrichment analysis, we discuss leading tools and machine learning methods that use epigenomic and 3D genome structure data. The hierarchy of tools and methods reviewed here serves as a practical guide for the interpretation of genome-wide ROIs within an epigenomic context.

Contact: mikhail.dozmorov@vcuhealth.org.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Comparative Study

MeSH terms

  • Epigenomics / methods*
  • Guidelines as Topic
  • Humans
  • Machine Learning*
  • Molecular Sequence Annotation / methods*
  • Software*