Statistical simulations show that scientists need not increase overall sample size by default when including both sexes in in vivo studies

Benjamin Phillips; Timo N Haschler; Natasha A Karp

doi:10.1371/journal.pbio.3002129

PLoS Biol. 2023 Jun 8;21(6):e3002129. doi: 10.1371/journal.pbio.3002129. eCollection 2023 Jun.

Authors

Benjamin Phillips¹, Timo N Haschler², Natasha A Karp¹

Affiliations

¹ Data Sciences & Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, United Kingdom.
² Bioscience Renal, Research and Early Development Cardiovascular, Renal and Metabolism, Biopharmaceutical R&D, AstraZeneca, Cambridge, United Kingdom.

Abstract

In recent years, there has been a strong drive to improve the inclusion of animals of both sexes in the design of in vivo research studies, motivated by a need to increase sex representation in fundamental biology and drug development. This has resulted in inclusion mandates by funding bodies and journals, alongside numerous published manuscripts highlighting the issue and providing guidance to scientists. However, progress is slow and barriers to the routine use of both sexes remain. A frequent, major concern is the perceived need for a higher overall sample size to achieve an equivalent level of statistical power, which would result in an increased ethical and resource burden. This perception arises either from the belief that sex inclusion will increase variability in the data (through a baseline difference or a treatment effect that depends on sex), thus reducing the sensitivity of statistical tests, or from misapprehensions about the correct way to analyse the data, including disaggregation or pooling by sex. Here, we conduct an in-depth examination of the consequences of including both sexes on statistical power. We performed simulations by constructing artificial datasets that encompass a range of outcomes that may occur in studies examining a treatment effect in the context of both sexes. These include baseline sex differences and situations in which the size of the treatment effect depends on sex, with effects in the same or in opposite directions. The data were then analysed using either a factorial analysis approach, which is appropriate for the design, or a t test approach following pooling or disaggregation of the data, which are common but erroneous strategies. The results demonstrate that there is no loss of power to detect treatment effects when splitting the sample size across sexes in most scenarios, provided that the data are analysed using an appropriate factorial analysis method (e.g., two-way ANOVA). In the rare situations where power is lost, the benefit of understanding the role of sex outweighs the power considerations. Additionally, use of the inappropriate analysis pipelines results in a loss of statistical power. Therefore, we recommend analysing data collected from both sexes using factorial analysis and splitting the sample size across male and female mice as a standard strategy.
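
The comparison described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' code. The function name `simulate_power` and all parameter values (group size, effect sizes, standard deviation, number of simulations) are assumptions chosen for illustration, and only one scenario is covered: a baseline sex difference with a treatment effect that is the same in both sexes. The factorial analysis is implemented as the treatment main-effect test of a balanced two-way ANOVA, and it is compared with a t test on data pooled across sex.

```python
import numpy as np
from scipy import stats


def simulate_power(n_per_group=10, treatment_effect=1.0, sex_difference=2.0,
                   sd=1.0, n_sims=2000, alpha=0.05, seed=0):
    """Estimate power to detect a treatment effect in a balanced
    2 x 2 (treatment x sex) design under two analysis strategies.

    All defaults are illustrative assumptions, not values from the paper.
    n_per_group is the number of animals per treatment arm, split equally
    between the sexes.
    """
    if n_per_group % 2 or n_per_group < 4:
        raise ValueError("n_per_group must be an even number >= 4")
    rng = np.random.default_rng(seed)
    n_cell = n_per_group // 2        # animals per treatment-by-sex cell
    df_resid = 4 * n_cell - 4        # residual df of the full factorial model

    # True cell means: axis 0 = treatment (control, treated), axis 1 = sex.
    treat = np.array([0.0, 1.0])[:, None]
    sex = np.array([0.0, 1.0])[None, :]
    true_means = treatment_effect * treat + sex_difference * sex

    hits_factorial = 0
    hits_pooled = 0
    for _ in range(n_sims):
        data = rng.normal(true_means[..., None], sd, size=(2, 2, n_cell))

        # Factorial analysis: treatment main effect, tested against the
        # within-cell (residual) variance. In a balanced design this t test
        # is equivalent to the two-way ANOVA F test (F = t ** 2).
        cell_means = data.mean(axis=2)
        mse = ((data - cell_means[..., None]) ** 2).sum() / df_resid
        contrast = cell_means[1].mean() - cell_means[0].mean()
        t_stat = contrast / np.sqrt(mse / n_cell)
        p_factorial = 2 * stats.t.sf(abs(t_stat), df_resid)
        hits_factorial += p_factorial < alpha

        # Pooled analysis: ignore sex and compare the two treatment arms.
        p_pooled = stats.ttest_ind(data[1].ravel(), data[0].ravel()).pvalue
        hits_pooled += p_pooled < alpha

    return {"factorial": hits_factorial / n_sims,
            "pooled_t_test": hits_pooled / n_sims}


if __name__ == "__main__":
    print(simulate_power())
```

In this sketch the pooled t test leaves the baseline sex difference in the error term, whereas the factorial analysis removes it, so the factorial estimate of power should be the higher of the two under these assumed settings. Setting `treatment_effect=0.0` gives a rough check that the false positive rate stays near `alpha`.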

Copyright: © 2023 Phillips et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

MeSH terms

Analysis of Variance
Animals
Female
Male
Mice
Research Design*
Sample Size
Sex Characteristics*

Grants and funding

The authors received no specific funding for this work.