High-Dimensional Profiling for Computational Diagnosis

Methods Mol Biol. 2017:1526:205-229. doi: 10.1007/978-1-4939-6613-4_12.

Abstract

New technologies allow for high-dimensional profiling of patients. For instance, genome-wide gene expression analysis in tumors or in blood is feasible with microarrays, provided all transcripts are known, or without this restriction using high-throughput RNA sequencing. Other technologies, such as NMR fingerprinting, allow for high-dimensional profiling of metabolites in blood or urine. Such technologies open novel possibilities for molecular diagnostics. In clinical profiling studies, researchers aim to predict disease type, survival, or treatment response for new patients from high-dimensional profiles. In doing so, they encounter a series of obstacles and pitfalls. We review fundamental issues from machine learning and recommend a procedure for the computational aspects of a clinical profiling study.

Keywords: Feature selection; Gene expression profiles; Metabolite analysis; Microarrays; Model assessment; NMR fingerprinting; RNA sequencing; Statistical classification; Supervised machine learning.

MeSH terms

  • Animals
  • Computational Biology / methods*
  • Gene Expression Profiling
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Machine Learning
  • Oligonucleotide Array Sequence Analysis / methods