Tool for filtering PubMed search results by sample size

J Am Med Inform Assoc. 2018 Jul 1;25(7):774-779. doi: 10.1093/jamia/ocx155.

Abstract

Objective: The most used search engine for scientific literature, PubMed, provides tools to filter results by several fields. When searching for reports on clinical trials, sample size can be among the most important factors to consider. However, PubMed does not currently provide any means of filtering search results by sample size. Such a filtering tool would be useful in a variety of situations, including meta-analyses or state-of-the-art analyses to support experimental therapies. In this work, a tool was developed to filter articles identified by PubMed based on their reported sample sizes.

Materials and methods: A search engine was designed to send queries to PubMed, retrieve results, and compute estimates of reported sample sizes using a combination of syntactical and machine learning methods. The sample size search tool is publicly available for download at http://ihealth.uemc.es. Its accuracy was assessed against a manually annotated database of 750 random clinical trials returned by PubMed.

Results: Validation tests show that the sample size search tool is able to accurately (1) estimate sample size for 70% of abstracts and (2) classify 85% of abstracts into sample size quartiles.

Conclusions: The proposed tool was validated as useful for advanced PubMed searches of clinical trials when the user is interested in identifying trials of a given sample size.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Clinical Trials as Topic*
  • Information Storage and Retrieval / methods*
  • PubMed*
  • ROC Curve
  • Sample Size*
  • Search Engine
  • Software