Measuring the quality of scientific references in Wikipedia: an analysis of more than 115M citations to over 800 000 scientific articles

FEBS J. 2021 Jul;288(14):4242-4248. doi: 10.1111/febs.15608. Epub 2020 Nov 19.

Abstract

Wikipedia is a widely used online reference work that cites hundreds of thousands of scientific articles across its entries. The quality of these citations has not previously been measured, and such measurements bear on the reliability and quality of the scientific portions of this reference work. Using a novel technique, a massive database of qualitatively described citations, and machine learning algorithms, we analyzed 1 923 575 Wikipedia articles that cited a total of 824 298 scientific articles in our database. We found that most scientific articles cited by Wikipedia articles are uncited or untested by subsequent studies, and that the remainder show wide variability in contradicting or supporting evidence. Additionally, we analyzed 51 804 643 scientific articles from journals indexed in the Web of Science and found that, similarly, most were uncited or untested by subsequent studies, while the remainder show wide variability in contradicting or supporting evidence.

Keywords: Wikipedia; bibliometrics; citations; replication.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Databases, Factual*
  • Encyclopedias as Topic*
  • Humans
  • Internet / standards*
  • Journal Impact Factor
  • Periodicals as Topic / standards*
  • Periodicals as Topic / statistics & numerical data*
  • Reproducibility of Results