A selective review of robust variable selection with applications in bioinformatics

Brief Bioinform. 2015 Sep;16(5):873-83. doi: 10.1093/bib/bbu046. Epub 2014 Dec 5.

Abstract

Bioinformatics studies have generated, and continue to generate, very large amounts of data. In analyzing such data, standard modeling approaches can be challenged by heavy-tailed errors and outliers in the response variables, contamination in the predictors (caused by, for instance, technical problems in microarray gene expression studies), model mis-specification and other problems. Robust methods are needed to tackle these challenges. When there is a large number of predictors, variable selection can be as important as estimation. Penalization has been extensively adopted as a generic variable selection and regularization tool. In this article, we provide a selective review of robust penalized variable selection approaches designed specifically for high-dimensional data from bioinformatics and biomedical studies. We discuss robust loss functions, penalty functions and computational algorithms, and briefly examine theoretical properties and implementation. Applications of robust penalization approaches in representative bioinformatics and biomedical studies are also illustrated.

Keywords: bioinformatics study; penalization; robust methods; variable selection.
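
To make the idea of robust penalization concrete, the following is a minimal sketch and not a method taken from the article. It assumes one particular combination, the Huber loss with a lasso (L1) penalty, fitted by proximal gradient descent. The function names (`huber_psi`, `soft_threshold`, `penalized_fit`), the tuning values (`lam=0.1`, `delta=1.345`) and the toy data are all hypothetical and chosen only for illustration.

```python
def huber_psi(r, delta):
    """Derivative of the Huber loss: equal to r near zero, clipped at +/- delta."""
    return max(-delta, min(delta, r))


def soft_threshold(z, t):
    """Proximal operator of t * |z|, i.e. the lasso shrinkage step."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def penalized_fit(X, y, lam, delta, n_iter=500):
    """Minimize (1/n) * sum_i huber(y_i - x_i . beta) + lam * sum_j |beta_j|
    by proximal gradient descent (illustrative sketch, no intercept)."""
    n, p = len(X), len(X[0])
    # Conservative step size: the squared Frobenius norm of X divided by n
    # bounds the Lipschitz constant of the gradient of the loss term.
    step = n / sum(v * v for row in X for v in row)
    beta = [0.0] * p
    for _ in range(n_iter):
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        # Gradient of the loss term; large residuals contribute at most delta.
        grad = [-sum(huber_psi(resid[i], delta) * X[i][j] for i in range(n)) / n
                for j in range(p)]
        # Gradient step on the loss, then soft-thresholding for the penalty.
        beta = [soft_threshold(beta[j] - step * grad[j], step * lam) for j in range(p)]
    return beta


# Hypothetical toy data: two orthogonal +/-1 predictors, only the first one
# matters (true coefficient 2), and one response is grossly contaminated.
n = 20
X = [[1.0 if i % 2 == 0 else -1.0, 1.0 if (i // 2) % 2 == 0 else -1.0]
     for i in range(n)]
y = [2.0 * row[0] for row in X]
y[0] += 50.0  # a single outlier in the response

robust = penalized_fit(X, y, lam=0.1, delta=1.345)
# A very large delta makes the Huber loss behave like squared error here.
plain = penalized_fit(X, y, lam=0.1, delta=1e9)
print("Huber + lasso:        ", robust)
print("squared error + lasso:", plain)
```

On this toy data the Huber fit should keep the first coefficient close to 2 and set the second to exactly zero, whereas the squared-error fit is pulled by the single outlier to roughly 4.4 and 2.4, so it would wrongly select the irrelevant predictor. Other robust losses and penalties can be substituted by changing `huber_psi` and `soft_threshold` respectively.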

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Computational Biology*
  • Models, Theoretical