Check your outliers! An introduction to identifying statistical outliers in R with easystats

Behav Res Methods. 2024 Mar 25. doi: 10.3758/s13428-024-02356-w. Online ahead of print.

Abstract

Beyond the challenge of keeping up to date with current best practices regarding the diagnosis and treatment of outliers, an additional difficulty arises concerning the mathematical implementation of the recommended methods. Here, we provide an overview of current recommendations and best practices and demonstrate how they can easily and conveniently be implemented in the R statistical computing software, using the {performance} package of the easystats ecosystem. We cover univariate, multivariate, and model-based statistical outlier detection methods, along with their recommended thresholds, standard output, and plotting methods. We conclude by reviewing the different theoretical types of outliers, whether to exclude or winsorize them, and the importance of transparency. A preprint of this paper is available at: 10.31234/osf.io/bu6nt.

Keywords: Easystats; Multivariate outliers; R; Robust detection methods; Univariate outliers.