minotaur: A platform for the analysis and visualization of multivariate results from genome scans with R Shiny

Mol Ecol Resour. 2017 Jan;17(1):33-43. doi: 10.1111/1755-0998.12579. Epub 2016 Aug 31.

Abstract

Genome scans are widely used to identify 'outliers' in genomic data: loci with different patterns compared with the rest of the genome due to the action of selection or other nonadaptive forces of evolution. These genomic data sets are often high dimensional, with complex correlation structures among variables, making it a challenge to identify outliers in a robust way. The Mahalanobis distance has been widely used, but has the major limitation of assuming that data follow a simple parametric distribution. Here, we develop three new metrics that can be used to identify outliers in multivariate space, while making no strong assumptions about the distribution of the data. These metrics are implemented in the R package minotaur, which also includes an interactive web-based application for visualizing outliers in high-dimensional data sets. We illustrate how these metrics can be used to identify outliers from simulated genetic data and discuss some of the limitations they may face in application.
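
For readers unfamiliar with the baseline that the abstract contrasts against, the Mahalanobis distance of an observation x from a sample with mean μ and covariance S is D(x) = sqrt((x − μ)ᵀ S⁻¹ (x − μ)). The following is a minimal sketch in Python with NumPy, not the minotaur implementation (which is an R package and whose API is not shown here). The function name `mahalanobis_distances`, the simulated two-statistic data, and the planted outlier are all assumptions made for illustration.

```python
import numpy as np


def mahalanobis_distances(X):
    """Mahalanobis distance of each row of X from the sample mean.

    Hypothetical helper for illustration only; rows are loci and
    columns are test statistics from a genome scan.
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)      # columns are variables
    inv_cov = np.linalg.pinv(cov)      # pseudo-inverse tolerates a singular covariance
    centered = X - mu
    # Quadratic form (x - mu)^T S^-1 (x - mu), evaluated row by row.
    d2 = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    return np.sqrt(np.maximum(d2, 0.0))


# Simulated example: 500 "neutral" loci with two positively correlated
# statistics, plus one planted locus that runs against the correlation.
rng = np.random.default_rng(0)
neutral = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=500)
scores = np.vstack([neutral, [3.0, -3.0]])

dist = mahalanobis_distances(scores)
top = int(np.argmax(dist))  # index of the most extreme locus
```

The planted locus is unremarkable on either statistic alone but lies far from the bulk once the correlation between the statistics is taken into account, which is the situation multivariate outlier metrics are meant to handle. Because this distance depends only on the mean and covariance, it can be misleading when the data are multimodal or strongly non-elliptical, which is the limitation the abstract raises.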

Keywords: Mahalanobis; genomic scans; kernel density.

MeSH terms

  • Biostatistics / methods*
  • Computational Biology / methods*
  • Genetics, Population / methods*
  • Genomics / methods*
  • Internet
  • Selection, Genetic*
  • Software*