Sample size planning for complex study designs: A tutorial for the mlpwr package

Behav Res Methods. 2023 Nov 29. doi: 10.3758/s13428-023-02269-0. Online ahead of print.

Abstract

A common challenge in designing empirical studies is determining an appropriate sample size. When more complex models are used, estimates of power can only be obtained using Monte Carlo simulations. In this tutorial, we introduce the R package mlpwr to perform simulation-based power analysis based on surrogate modeling. Surrogate modeling is a powerful tool in guiding the search for study design parameters that imply a desired power or meet a cost threshold (e.g., in terms of monetary cost). mlpwr can be used to search for the optimal allocation when there are multiple design parameters, e.g., when balancing the number of participants and the number of groups in multilevel modeling. At the same time, the approach can take into account the cost of each design parameter, and aims to find a cost-efficient design. We introduce the basic functionality of the package, which can be applied to a wide range of statistical models and study designs. Additionally, we provide two examples based on empirical studies for illustration: one for sample size planning when using an item response theory model, and one for assigning the number of participants and the number of countries for a study using multilevel modeling.

Keywords: Machine learning; Power analysis; Sample size; Simulation.