The revised Cochrane risk of bias tool for randomized trials (RoB 2) showed low interrater reliability and challenges in its application

J Clin Epidemiol. 2020 Oct:126:37-44. doi: 10.1016/j.jclinepi.2020.06.015. Epub 2020 Jun 18.

Abstract

Objective: The objective of the study is to assess the interrater reliability (IRR) and usability of the revised Cochrane risk of bias tool for randomized trials (RoB 2).

Study design and setting: This is a cross-sectional study. Four raters independently applied RoB 2 to the primary outcome of a random sample of individually randomized, parallel-group randomized controlled trials (RCTs). We calculated Fleiss' kappa for multiple raters, measured the time needed to complete the tool, and discussed the application of RoB 2 to identify difficulties and reasons for disagreement.
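As an illustration of the statistic named above, the following is a minimal sketch of Fleiss' kappa for a fixed number of raters per item. It is not the authors' analysis code; the function name `fleiss_kappa`, the input layout (one list of labels per outcome, one label per rater), and the example labels are assumptions made for this sketch.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for items each rated by the same number of raters.

    `ratings` is a list with one entry per item (e.g. per trial outcome);
    each entry is a list of category labels, one per rater.
    This layout is an assumption of the sketch, not taken from the study.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")
    if any(len(row) != n_raters for row in ratings):
        raise ValueError("every item must be rated by the same number of raters")

    categories = sorted({label for row in ratings for label in row})
    counts = [Counter(row) for row in ratings]

    # Observed agreement: for each item, the share of rater pairs that agree.
    per_item = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: from the overall proportion of ratings per category.
    proportions = [
        sum(c[k] for c in counts) / (n_items * n_raters) for k in categories
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        # All ratings fall in one category; kappa is undefined.
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical usage: four raters, three outcomes, RoB 2 style judgments.
example = [
    ["low", "low", "some concerns", "high"],
    ["high", "high", "high", "high"],
    ["low", "some concerns", "some concerns", "low"],
]
kappa = fleiss_kappa(example)
```

The sketch returns only the point estimate; the confidence intervals reported in the abstract would need an additional variance or resampling step that is not shown here.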

Results: A total of 70 outcomes from 70 RCTs were included. IRR was slight for the overall judgment (IRR 0.16, 95% confidence interval (CI) 0.08-0.24). In the individual domain analysis, IRR was moderate for "randomization process" (IRR 0.45, 95% CI 0.37-0.53). For "deviations from intended intervention", IRR was slight for RCTs assessing the effect of assignment to an intervention (IRR 0.04, 95% CI -0.06 to 0.14) and fair for those assessing the effect of adhering to an intervention (IRR 0.21, 95% CI 0.11-0.31). IRR was fair for the other domains, ranging from 0.22 (95% CI 0.14-0.30) for "missing outcome data" to 0.30 (95% CI 0.22-0.38) for "selection of reported results". The mean time to apply the tool was 28 minutes (standard deviation 13.4) per study outcome. The main difficulties were due to poor knowledge of the subject matter of the primary studies, new terminology, approaches to some domains that differ from those of the previous tool, and the way the signaling questions are formulated.
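The verbal labels in the results ("slight", "fair", "moderate") appear consistent with the commonly used Landis and Koch benchmarks for kappa, although the abstract does not name the scale; that mapping is an assumption here. A minimal sketch, with the hypothetical helper `landis_koch_label`:

```python
def landis_koch_label(kappa):
    """Map a kappa value to a Landis and Koch style verbal label.

    Assumes the conventional cut points (0.20, 0.40, 0.60, 0.80);
    the abstract itself does not state which scale was used.
    """
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Point estimates quoted in the abstract, paired with the labels it reports.
reported = {
    0.16: "slight",    # overall judgment
    0.45: "moderate",  # randomization process
    0.04: "slight",    # deviations, effect of assignment
    0.21: "fair",      # deviations, effect of adhering
    0.22: "fair",      # missing outcome data
    0.30: "fair",      # selection of reported results
}
all_match = all(landis_koch_label(k) == label for k, label in reported.items())
```

Under these assumed cut points, every point estimate in the abstract falls in the band matching its reported label.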

Conclusions: RoB 2 is a detailed and comprehensive tool, but it is difficult and demanding to apply, even for raters with substantial expertise in systematic reviews. Calibration exercises and intensive training are needed before its application to improve reliability.

Keywords: Interrater reliability; Randomized controlled trials; Risk of bias; RoB 2; Systematic reviews.

Publication types

  • Comparative Study

MeSH terms

  • Bias
  • Cross-Sectional Studies
  • Data Analysis
  • Data Collection / methods*
  • Humans
  • Judgment / physiology*
  • Knowledge
  • Outcome Assessment, Health Care
  • Randomized Controlled Trials as Topic
  • Reproducibility of Results
  • Research Design
  • Research Personnel / statistics & numerical data*
  • Research Personnel / trends
  • Risk