Missing-Values Imputation Algorithms for Microarray Gene Expression Data

Methods Mol Biol. 2019:1986:255-266. doi: 10.1007/978-1-4939-9442-7_12.

Abstract

In gene expression studies, missing values are a common problem with important consequences for the interpretation of the final data (Satija et al., Nat Biotechnol 33(5):495, 2015). Numerous bioinformatics analysis tools used for cancer prediction operate on the data set matrix (Bailey et al., Cell 173(2):371-385, 2018); it is therefore necessary to resolve the problem of missing-values imputation. This chapter reviews research on missing-values imputation approaches for gene expression data. We compare the algorithms mainly by how they use local and global correlation in the data, and we classify them as global, hybrid, local, or knowledge-based techniques. The chapter also presents suitable assessments of the different approaches. The purpose of this review is to direct scientists toward developments in current techniques, rather than toward applying different or newly developed algorithms with identical functional goals, with the aim of adapting the algorithms to the characteristics of the data.
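
To make the "local" category concrete, the following is a minimal sketch of nearest-neighbor imputation, a widely used example of a method that relies on local correlation between genes. It is an illustration only and is not taken from the chapter: the function names `row_distance` and `knn_impute`, the toy matrix `expr`, the choice of `k`, and the row-mean fallback are all assumptions made here. Rows are taken to be genes, columns samples, and `None` marks a missing value.

```python
import math


def row_distance(a, b):
    """Root-mean-square difference over columns observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no overlap: the rows cannot be compared
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(matrix, k=2):
    """Return a copy of matrix with each missing entry replaced by the mean
    of that column over the k most similar rows (hypothetical helper)."""
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors: other rows that have column j observed.
            candidates = sorted(
                (row_distance(row, other), other[j])
                for m, other in enumerate(matrix)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in candidates[:k] if math.isfinite(d)]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
            else:
                # Assumed fallback: mean of the row's own observed values,
                # or leave the gap if the row has none.
                observed = [v for v in row if v is not None]
                filled[i][j] = sum(observed) / len(observed) if observed else None
    return filled


# Toy expression matrix (made-up numbers, for illustration only).
expr = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, None, 4.1],
    [0.9, 1.9, 2.9, 3.9],
    [5.0, 6.0, 7.0, 8.0],
]
filled = knn_impute(expr, k=2)
print(filled[1][2])
```

In this toy matrix the gap in the second row is filled from the first and third rows, which are its two closest neighbors, while the dissimilar fourth row is ignored. A global method would instead estimate the entry from structure shared across the whole matrix.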

Keywords: Cancer informatics; Computational intelligence; Gene expression data; Microarray; Missing-values imputation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Databases, Genetic*
  • Gene Expression Profiling*
  • Gene Ontology
  • Oligonucleotide Array Sequence Analysis / methods*
  • Reproducibility of Results