The full report of a Cochrane-funded study designed to compare the existing and revised Cochrane risk of bias tools for RCTs was finalised in June 2021 and has now been made available to the Cochrane community. The study was initiated to collect evidence about the amount of time and effort review authors need to invest to use the revised risk of bias tool for RCTs (RoB 2) compared with the existing study-based tool (hereafter referred to as RoB 1), and how review quality is affected. The work, led by Bernd Richter and Bianca Hemmingsen from the Cochrane Metabolic and Endocrine Disorders Group, represents an important step in assessing the benefits RoB 2 holds for improving the reliability and quality of Cochrane Reviews, and whether those benefits come at the cost of the time taken to complete assessments.
RoB 2 represents a change to the way bias is assessed and reported in Cochrane Reviews and was designed to guide authors through meaningful, results-based assessments. The revised tool requires a change in focus from studies to the data they contribute, and biases introduced into review analyses. However, following concerns about the usability of the revised tool and the time it might add to the review process, Cochrane set out plans to support the first author teams and review groups using it in a pilot. The aim was to observe any difficulties encountered throughout the review journey and build resources and functionality in our technology that would minimize the disruption of wider uptake. The comparison study ran alongside the pilot to compare RoB 2 with RoB 1 and collect empirical evidence within an index review.
The study was conducted using data from a Cochrane review on insulin analogues for type 1 diabetes mellitus, and the review sample for comparison was 26 RCTs that included parallel, crossover, non-inferiority, superiority, open-label and blinded designs. The review looked at eleven outcome measures including both dichotomous and continuous outcomes and objective and patient-reported endpoints. The team applied both tools to all outcomes according to a pre-specified protocol to compare key features, difficulties in use, and potential consequences for the results of the systematic review.
What were the main findings?
The study authors found that mean assessment times using RoB 2 were comparable to those for RoB 1 across all outcomes (approximately 30 minutes), despite experiencing a learning curve with RoB 2, and reported no major usability problems with either tool. Overall, raters experienced few difficulties reaching agreement with either RoB 1 or RoB 2, and differences in consensus judgements between the tools were uncommon. The biggest divergence between the tools was for subjective outcomes in open-label studies, where RoB 1 was more likely to penalize than RoB 2. RoB 1 also tended to lead to harsher risk of bias judgements because of the limited options available (high/low/unclear), whereas the ratings, algorithm, signalling questions and guidance built into the RoB 2 tool made it easier to work through complexity and context. The study also investigated the value of clinical study reports (CSRs) for bias assessment with both tools and found them to be very valuable irrespective of the tool used, though assessments based on CSRs took longer to complete (45 to 90 minutes).
Differences were reported in the nature of discrepancies requiring consensus for RoB 1 and RoB 2. With RoB 1, disagreements were most common in the ‘performance bias’ and ‘detection bias’ domains for participant-reported outcome measures. These biases are teased out in a more granular way with RoB 2, and the authors noted that systems for capturing outcome-based variation with RoB 1 can overlook issues (e.g. by grouping outcomes as objective or subjective), or require post-hoc clinical judgement that might itself introduce bias. For RoB 2, ‘deviations from intended interventions’ (domain 2) and ‘selection of the reported result’ (domain 5) led to most discrepancies, particularly when trial protocols were not available.
The study’s conclusions highlight the benefits of RoB 2 in encouraging reviewers to think in terms of endpoints and results to improve the quality and relevance of risk of bias assessment in Cochrane reviews. The authors also make several recommendations to improve the usability of RoB 2, such as adding elaborations or tweaking the wording of some signalling questions, and creating processes within the Excel® tool to automatically populate study-based data that are not expected to vary across outcomes.
What does it mean for Cochrane?
The results of any comparison between the two tools may be expected to vary with knowledge of the topic area, methodological expertise, familiarity with the existing tool, and the type and number of outcomes within the review. Therefore, no one study can provide definitive results about their relative usability and impact. The authors of this study are both experienced methodologists and familiar with the clinical area. They also applied both tools in line with MECIR to capture differences in bias across outcomes, which may have reduced the extent of difference between the tools, both in the time taken to conduct assessments and in the impact on analysis results. The study’s findings must be interpreted in light of these factors: though the efficiency difference might have been starker had RoB 1 been applied at study level, the reliability and usefulness of the assessments would likely have suffered in equal measure.
The authors also highlight that differences in the way the tools are structured and how reviewers choose to embed bias judgements in their analyses limit the usefulness of some comparisons, especially when it comes to demonstrating impact on analysis results. Nonetheless, it is an important addition to the real-world evidence base and allows Cochrane to understand how the tools vary and continue tailoring implementation resources to address common issues and errors. Studies like the one reported here also provide rich stories to streamline how the tool is implemented through technology to create efficiencies and make it easier for the benefits of bias assessment to be realised.
It is in light of these considerations that the expectations for RoB 2 remain as they were set in November 2020, which incorporate an element of flexibility in continuing to use RoB 1, provided it is applied in a way that allows differences in bias across outcomes to be captured. The expectations accommodate the ongoing learning curve with RoB 2 and circumstances where, on the joint judgement of authors and CRG staff, adopting the new tool might not be feasible (e.g. for a large update). Nonetheless, purely study-based bias assessments should be a thing of the past, and training, planning, careful consideration and debate within the author team should underpin any meaningful bias assessment.
The Methods Team and Bias Methods Group welcome feedback about the usability of RoB 2 and suggestions for improving the Excel tool and the wording of signalling questions, and will continue to develop resources to support author teams and editors.
Attributed to Kayleigh Kew, not on behalf of EMD