Comparison of four heterogeneity measures for meta-analysis

J Eval Clin Pract. 2020 Feb;26(1):376-384. doi: 10.1111/jep.13159. Epub 2019 Jun 24.

Abstract

Rationale, aims, and objectives: Heterogeneity is a critical issue in meta-analysis because it bears on the appropriateness of combining the collected studies and affects the reliability of the synthesized results. The Q test is the traditional method for assessing heterogeneity; however, because it lacks an intuitive interpretation for clinicians and often has low statistical power, many meta-analysts instead use measures such as the I² statistic to quantify the extent of heterogeneity. This article aims to summarize the available tools for assessing heterogeneity and to compare their performance.

Methods: We reviewed four heterogeneity measures (I², R̂_I, R̂_M, and R̂_b) and illustrated how they could be treated as test statistics like the Q statistic. These measures were compared with respect to statistical power based on simulations driven by three real-data examples. The pairwise agreement among the four measures was also evaluated using Cohen's κ coefficient.
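For orientation, the two most familiar quantities named above can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions (Cochran's Q with inverse-variance weights, and I² = max(0, (Q - df)/Q) with df = k - 1), not the authors' simulation code. The function name `q_and_i2` and the input values are hypothetical, and the three R̂ measures are not reproduced here.

```python
def q_and_i2(effects, variances):
    """Return (Q, I2) for k study effect estimates and their within-study variances.

    Q is Cochran's heterogeneity statistic with inverse-variance weights.
    I2 is returned as a proportion in [0, 1), truncated at 0.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    weights = [1.0 / v for v in variances]
    # Fixed-effect (inverse-variance) pooled estimate.
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))

    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2


# Hypothetical two-study example (illustrative numbers, not from the article).
q, i2 = q_and_i2([0.0, 2.0], [0.5, 0.5])
print(q, i2)
```

Under the null hypothesis of homogeneity, Q is conventionally referred to a chi-squared distribution with k - 1 degrees of freedom; how the article derives critical values for the other measures is not stated in this abstract.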

Results: Generally, R̂_I was slightly more powerful than the Q test, while its type I error rate might be slightly inflated. The power of I² was fairly close to that of Q. The R̂_M and R̂_b statistics might have low power in some cases. Because the differences between the powers of I², R̂_I, and Q were often tiny, meta-analysts might not expect I² and R̂_I to yield significant heterogeneity if the Q test failed to do so. In addition, I² and R̂_I had fairly good agreement based on the simulated meta-analyses, but all other pairs of heterogeneity measures generally had poor agreement.
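The pairwise agreement reported here is based on Cohen's κ, which compares observed agreement with the agreement expected by chance: κ = (p_o - p_e) / (1 - p_e). The sketch below assumes, for illustration only, that each measure yields a binary decision (1 = significant heterogeneity, 0 = not) for every simulated meta-analysis; the abstract does not state how decisions were coded. The function name `cohens_kappa` and the example decisions are hypothetical.

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical decisions."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")

    a, b = list(a), list(b)
    n = len(a)
    categories = set(a) | set(b)

    # Observed proportion of items on which the two measures agree.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from the product of the marginal proportions.
    p_e = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)

    if p_e == 1.0:
        # Both sequences are constant and identical: kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical decisions from two measures on four simulated meta-analyses.
measure_1 = [1, 1, 0, 0]
measure_2 = [1, 0, 0, 0]
print(cohens_kappa(measure_1, measure_2))
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance.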

Conclusion: The I² and R̂_I statistics are recommended for measuring heterogeneity. Meta-analysts should use the heterogeneity measures as descriptive statistics that have intuitive interpretations from the clinical perspective, rather than determining the significance of heterogeneity simply from their magnitudes.

Keywords: I² statistic; heterogeneity; meta-analysis; statistical power.

Publication types

  • Review

MeSH terms

  • Humans
  • Meta-Analysis as Topic
  • Reproducibility of Results*