Interpretation of omics data analyses

J Hum Genet. 2021 Jan;66(1):93-102. doi: 10.1038/s10038-020-0763-5. Epub 2020 May 8.

Abstract

Omics studies attempt to extract meaningful messages from large-scale, high-dimensional data sets by treating the data sets as a whole. This concept is important in every step of the data-handling procedure: pre-processing of the data records, statistical analysis and machine learning, translation of the outputs into forms that humans can perceive naturally, and acceptance of the messages with their uncertainty. For pre-processing, methods for controlling data quality and batch effects are discussed. For the main analyses, the approaches are divided into two types, and their basic concepts are discussed. The first type is the evaluation of many items individually, followed by interpretation of the individual items in the context of multiple testing and combination. The second type is the extraction of a smaller number of important aspects from the whole data records. The outputs of the main analyses are translated into natural language with techniques such as annotation and ontology. Another technique for making the outputs perceptible is visualization. At the end of this review, one of the most important issues in the interpretation of omics data analyses is discussed: omics studies have a large amount of information in their data sets, and every approach reveals only a very restricted aspect of the whole data set. The understandable messages from these studies therefore carry unavoidable uncertainty.
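
As an illustration of the first type of analysis (many items evaluated individually, then interpreted in the context of multiple testing), the following is a minimal sketch of one widely used correction, the Benjamini-Hochberg false discovery rate procedure. The abstract does not name a specific method, so the choice of procedure, the function name benjamini_hochberg, the 0.05 threshold, and the five p-values are all assumptions made for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # stay monotone (an adjusted value never exceeds the one ranked above it).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five items (for example, genes); not from the review.
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
q_values = benjamini_hochberg(p_values)
# Assumed false discovery rate threshold of 0.05.
significant = [q <= 0.05 for q in q_values]
```

With these made-up inputs, the third and fourth items have raw p-values below 0.05 but are no longer called significant after adjustment, which is the kind of shift in interpretation that arises when items are judged together and not one at a time.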

Publication types

  • Review

MeSH terms

  • Data Interpretation, Statistical
  • Epigenomics / methods
  • Epigenomics / standards
  • Epigenomics / statistics & numerical data*
  • Gas Chromatography-Mass Spectrometry / methods
  • Gas Chromatography-Mass Spectrometry / standards
  • Gas Chromatography-Mass Spectrometry / statistics & numerical data
  • Gene Expression Profiling / methods
  • Gene Expression Profiling / standards
  • Gene Expression Profiling / statistics & numerical data*
  • Genomics / methods
  • Genomics / standards
  • Genomics / statistics & numerical data*
  • High-Throughput Nucleotide Sequencing / methods
  • High-Throughput Nucleotide Sequencing / standards
  • High-Throughput Nucleotide Sequencing / statistics & numerical data
  • Humans
  • Metabolomics / methods
  • Metabolomics / standards
  • Metabolomics / statistics & numerical data*
  • Proteomics / methods
  • Proteomics / standards
  • Proteomics / statistics & numerical data*
  • Quality Control