Machine learning to assist risk-of-bias assessments in systematic reviews

Int J Epidemiol. 2016 Feb;45(1):266-77. doi: 10.1093/ije/dyv306. Epub 2015 Dec 8.

Abstract

Background: Risk-of-bias assessments are now a standard component of systematic reviews. At present, reviewers need to manually identify relevant parts of research articles for a set of methodological elements that affect the risk of bias, in order to make a risk-of-bias judgement for each of these elements. We investigate the use of text mining methods to automate risk-of-bias assessments in systematic reviews. We aim to identify relevant sentences within the text of included articles, to rank articles by risk of bias and to reduce the number of risk-of-bias assessments that the reviewers need to perform by hand.

Methods: We use supervised machine learning to train two types of model for each of three risk-of-bias properties: sequence generation, allocation concealment and blinding. The first model predicts whether a sentence in a research article contains relevant information. The second model predicts a risk-of-bias value for each research article. Both are logistic regression models in which each independent variable is the frequency of a word in a sentence or in an article, respectively.
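
As an illustration of the sentence-level model, the following is a minimal sketch of logistic regression on word frequencies, written with the Python standard library only. The training sentences, labels, learning rate, regularisation strength and all function names are hypothetical and chosen for illustration; they are not taken from the study, and the fitting procedure (plain batch gradient descent) is an assumption rather than the authors' implementation.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lower-case and keep alphabetic tokens only (a simplifying assumption).
    return re.findall(r"[a-z]+", text.lower())

def build_vocab(texts):
    # Map every word seen in training to a feature index.
    words = sorted({w for t in texts for w in tokenize(t)})
    return {w: i for i, w in enumerate(words)}

def featurize(text, vocab):
    # One independent variable per word: its frequency in the text.
    x = [0.0] * len(vocab)
    for word, count in Counter(tokenize(text)).items():
        if word in vocab:
            x[vocab[word]] = float(count)
    return x

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train(X, y, lr=0.1, epochs=1000, l2=0.01):
    # Batch gradient descent on the L2-penalised logistic loss.
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j, xij in enumerate(xi):
                grad_w[j] += err * xij
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Hypothetical toy data: 1 = sentence relevant to sequence generation or
# allocation concealment, 0 = not relevant.
sentences = [
    "Patients were randomised using a computer generated sequence",
    "The allocation sequence was generated by a random number table",
    "Randomisation was concealed using sealed opaque envelopes",
    "The mean age of participants was 54 years",
    "Adverse events were reported in both groups",
    "The primary outcome was mortality at 30 days",
]
labels = [1, 1, 1, 0, 0, 0]

vocab = build_vocab(sentences)
X = [featurize(s, vocab) for s in sentences]
w, b = train(X, labels)

p_relevant = predict(w, b, featurize("A computer generated random sequence was used", vocab))
p_irrelevant = predict(w, b, featurize("Mean age was similar in both groups", vocab))
print(round(p_relevant, 3), round(p_irrelevant, 3))
```

The article-level model could be sketched the same way by passing the full text of each article to `featurize` instead of a single sentence, with the risk-of-bias judgement as the label.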

Results: Sentences can be ranked by relevance with an area under the receiver operating characteristic (ROC) curve (AUC) > 0.98, and articles can be ranked by risk of bias with AUC > 0.72. We estimate that more than 33% of articles can be assessed by just one reviewer, where two reviewers are normally required.
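
For readers unfamiliar with AUC as a ranking measure, the sketch below computes it as the probability that a randomly chosen positive item is scored above a randomly chosen negative item, counting ties as one half. The function name `auc` and the example scores and labels are hypothetical; they are not data from the study.

```python
def auc(scores, labels):
    # Pairwise (Mann-Whitney) formulation of the area under the ROC curve.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative item")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))

# Hypothetical model scores for six sentences; 1 = relevant, 0 = not relevant.
example_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
example_labels = [1, 1, 0, 1, 0, 0]
example_auc = auc(example_scores, example_labels)
print(round(example_auc, 3))
```

An AUC of 1.0 means every relevant item is ranked above every irrelevant one, and 0.5 corresponds to a random ordering.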

Conclusions: We show that text mining can be used to assist risk-of-bias assessments.

Keywords: Risk of bias; machine learning; systematic review; text mining.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bias*
  • Data Mining / methods*
  • Datasets as Topic
  • Humans
  • Logistic Models
  • Machine Learning / statistics & numerical data*
  • ROC Curve
  • Review Literature as Topic*