Deduplicating records in systematic reviews: there are free, accurate automated ways to do so

Nathalia Sernizon Guimarães; Andrêa J F Ferreira; Rita de Cássia Ribeiro Silva; Adelzon Assis de Paula; Cinthia Soares Lisboa; Laio Magno; Maria Yury Ichiara; Maurício Lima Barreto

doi:10.1016/j.jclinepi.2022.10.009

J Clin Epidemiol. 2022 Dec:152:110-115. doi: 10.1016/j.jclinepi.2022.10.009. Epub 2022 Oct 12.

Authors

Nathalia Sernizon Guimarães¹, Andrêa J F Ferreira², Rita de Cássia Ribeiro Silva³, Adelzon Assis de Paula⁴, Cinthia Soares Lisboa⁵, Laio Magno⁶, Maria Yury Ichiara⁷, Maurício Lima Barreto⁸

Affiliations

¹ Institute of Collective Health. Federal University of Bahia, Salvador, Bahia, Brazil. Electronic address: nasernizon@gmail.com.
² Centre for Data and Knowledge Integration for Health (CIDACS), Oswaldo Cruz Foundation, Salvador, Bahia, Brazil; The Ubuntu Center on Racism, Global Movements, and Population Health Equity, Dornsife School of Public Health, Drexel University, Philadelphia, PA, USA.
³ Centre for Data and Knowledge Integration for Health (CIDACS), Oswaldo Cruz Foundation, Salvador, Bahia, Brazil; Department of Nutrition, School of Nutrition, Federal University of Bahia, Salvador, Bahia, Brazil.
⁴ Institute of Collective Health. Federal University of Bahia, Salvador, Bahia, Brazil.
⁵ Postgraduate Programme in Collective Health, State University of Feira de Santana, Feira de Santana, Bahia, Brazil.
⁶ Institute of Collective Health. Federal University of Bahia, Salvador, Bahia, Brazil; Department of Life Sciences, State University of Bahia, Salvador, Bahia, Brazil.
⁷ Centre for Data and Knowledge Integration for Health (CIDACS), Oswaldo Cruz Foundation, Salvador, Bahia, Brazil.
⁸ Institute of Collective Health. Federal University of Bahia, Salvador, Bahia, Brazil; Centre for Data and Knowledge Integration for Health (CIDACS), Oswaldo Cruz Foundation, Salvador, Bahia, Brazil.

PMID: 36241035

Abstract

Objective: We assessed the accuracy of a set of automated deduplication tools in identifying duplicate records during the eligibility process of systematic reviews.

Study design and setting: A planned search strategy was run on seven electronic databases up to May 31, 2021. Using manual search as the reference standard, we assessed sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV).
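
The four measures named above follow from a 2x2 comparison of each tool's duplicate flags against the manual reference standard. The sketch below is illustrative only and is not the authors' analysis code: the function name `accuracy_measures`, the `Accuracy` container, and the record IDs are all hypothetical, and the example data are invented to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accuracy:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a measure is undefined.
    return numerator / denominator if denominator else float("nan")


def accuracy_measures(all_ids, reference_duplicates, tool_duplicates) -> Accuracy:
    """Compare a tool's duplicate flags with a manual reference standard.

    all_ids: every record retrieved by the search.
    reference_duplicates: records marked as duplicates by manual checking.
    tool_duplicates: records marked as duplicates by the automated tool.
    """
    all_ids = set(all_ids)
    ref = set(reference_duplicates) & all_ids
    tool = set(tool_duplicates) & all_ids

    tp = len(ref & tool)            # true duplicates the tool flagged
    fp = len(tool - ref)            # unique records wrongly flagged
    fn = len(ref - tool)            # true duplicates the tool missed
    tn = len(all_ids - ref - tool)  # unique records correctly kept

    return Accuracy(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


# Invented example: 20 records, 4 true duplicates, tool flags 5 records.
result = accuracy_measures(
    all_ids=range(1, 21),
    reference_duplicates={1, 2, 3, 4},
    tool_duplicates={1, 2, 3, 5, 6},
)
print(result)
```

In this invented example the tool finds 3 of the 4 true duplicates (sensitivity 0.75) but 2 of its 5 flags are wrong (PPV 0.60), which illustrates how a tool can miss few duplicates and still require manual checking of what it flags.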

Results: Specificity ranged from 0.96 to 1.00. Rayyan, Mendeley, and Systematic Review Accelerator (SRA) showed high sensitivity (0.98 [95% CI = 0.94-1.00], 0.93 [95% CI = 0.88-0.97], and 0.90 [95% CI = 0.84-0.95], respectively), whereas EndNote X9 and Zotero had only fair sensitivity (0.73 [95% CI = 0.65-0.80] and 0.74 [95% CI = 0.66-0.81], respectively). Negative predictive value ranged from 0.99 to 1.00. Mendeley and SRA had good PPV (0.93 [95% CI = 0.88-0.97] and 0.99 [95% CI = 0.96-1.00], respectively). PPV was fair for EndNote X9 (0.61 [95% CI = 0.54-0.69]) and Zotero (0.62 [95% CI = 0.54-0.69]) and poor for Rayyan (0.41 [95% CI = 0.36-0.47]).

Conclusion: The most suitable tool depends on the characteristics of its interface, the algorithm it uses to identify and exclude duplicates, and the transparency of the process. In this evaluation, Rayyan, Mendeley, and SRA proved accurate enough for the deduplication step of systematic reviews.
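
To make concrete what "the algorithm to identify and exclude duplicates" can mean, the following is a minimal sketch of one simple rule-based approach: match on DOI when present, otherwise on a normalized title plus year. It is an assumption for illustration and does not reproduce the algorithm of Rayyan, Mendeley, SRA, EndNote X9, or Zotero; the record fields (`id`, `title`, `year`, `doi`), the function names, and the example DOIs are hypothetical.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    # Strip accents, lowercase, and collapse punctuation and whitespace.
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def record_key(record: dict) -> tuple:
    # Prefer the DOI as the matching key; fall back to title plus year.
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    title = normalize_title(record.get("title", ""))
    return ("title_year", title, str(record.get("year", "")))


def deduplicate(records: list) -> tuple:
    """Return (unique, duplicates), keeping the first record seen per key."""
    seen = set()
    unique, duplicates = [], []
    for record in records:
        key = record_key(record)
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
            unique.append(record)
    return unique, duplicates


# Invented records; the DOIs are placeholders, not real identifiers.
records = [
    {"id": 1, "title": "Deduplicating records in systematic reviews",
     "year": 2022, "doi": "10.1000/example.1"},
    {"id": 2, "title": "Deduplicating Records in Systematic Reviews.",
     "year": 2022, "doi": "10.1000/EXAMPLE.1"},
    {"id": 3, "title": "Análise de dados", "year": 2020},
    {"id": 4, "title": "Analise de Dados", "year": 2020},
    {"id": 5, "title": "Another study", "year": 2021},
]
unique, duplicates = deduplicate(records)
print([r["id"] for r in unique], [r["id"] for r in duplicates])
```

A rule this strict misses duplicates where one export carries a DOI and the other does not, or where titles differ by more than case, accents, and punctuation. A looser rule would catch more duplicates but flag more unique records, which is the trade-off between sensitivity and PPV.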

Keywords: Accuracy; Deduplication; Epidemiological research; Libraries; Nutrition research methodologies; Systematic review.

MeSH terms

Algorithms*
Databases, Factual
Humans
Predictive Value of Tests
Reference Standards
Systematic Reviews as Topic