Illusions of objectivity and a recommendation for reporting data mining results

Eur J Clin Pharmacol. 2007 May;63(5):517-21. doi: 10.1007/s00228-007-0279-3. Epub 2007 Mar 16.

Abstract

Objective: Data mining algorithms (DMAs) are being applied to spontaneous reporting system (SRS) databases in the hope of obtaining timely insights into post-licensure safety data. Some DMAs have been characterized as "objective" screening tools. However, many configuration parameters can be modified, including the choice of vendor, and these choices may affect results. Our objective is to compare the data mining results for pre-selected drug-event combinations (DECs) from two commonly used software programs run under similar protocols.

Methods: Two DMAs, using three thresholds, were retrospectively applied to a set of eight pre-selected DECs in the US FDA safety database, with data through Q2 2005.

Results: For the selected DECs, the two vendors differed in the number of cases associated with a signal of disproportionate reporting (SDR), the first year of SDRs, and the magnitude of the SDR scores. These differences were deemed potentially significant for 45.8% (11/24) of the data points.

Conclusion: The observed differences between vendors could partially be explained by their differing methods of data cleaning and transformation as well as by the specific features of individual algorithms. The choices of vendors and available data mining configurations maximize the exploratory capacity of data mining, but they also raise questions about the claimed objectivity of data mining results. Given the exploratory nature of data mining in pharmacovigilance, these choices can also make data mining exercises susceptible to confirmation bias. When reporting results, the vendor and all data mining configuration details should be specified.
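
As an illustration of why configuration details matter, the sketch below scores one hypothetical drug-event combination against three hypothetical thresholds. It is not the method of either vendor compared here: the abstract does not name the algorithms, so the proportional reporting ratio (PRR) is used as a stand-in disproportionality measure. The report counts, the threshold names and values, and the `Threshold` and `is_sdr` helpers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Hypothetical signal criterion: a minimum score and a minimum case count."""
    name: str
    min_score: float
    min_cases: int


def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio from a 2x2 table of report counts.

    a: reports with the drug and the event
    b: reports with the drug and any other event
    c: reports with any other drug and the event
    d: reports with any other drug and any other event
    """
    return (a / (a + b)) / (c / (c + d))


def is_sdr(a: int, b: int, c: int, d: int, threshold: Threshold) -> bool:
    """True if the combination meets both parts of the given threshold."""
    return a >= threshold.min_cases and prr(a, b, c, d) >= threshold.min_score


# Made-up counts for a single drug-event combination (not data from the study).
counts = (4, 996, 200, 98800)

# Made-up thresholds standing in for the "three thresholds" of a protocol.
thresholds = [
    Threshold("lenient", min_score=1.5, min_cases=1),
    Threshold("moderate", min_score=2.0, min_cases=3),
    Threshold("strict", min_score=3.0, min_cases=5),
]

# Same counts, same measure, different configuration: the signal status changes.
results = {t.name: is_sdr(*counts, t) for t in thresholds}

if __name__ == "__main__":
    print(f"PRR = {prr(*counts):.2f}")
    for name, flagged in results.items():
        print(f"{name}: {'SDR' if flagged else 'no SDR'}")
```

With these made-up inputs the combination is flagged only under the most lenient threshold, so a report that omits the threshold (or the vendor's data cleaning rules, which determine the counts themselves) cannot be reproduced or compared.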

Publication types

  • Comparative Study

MeSH terms

  • Adverse Drug Reaction Reporting Systems / statistics & numerical data*
  • Algorithms*
  • Commerce
  • Data Interpretation, Statistical
  • Databases, Factual
  • Drug-Related Side Effects and Adverse Reactions*
  • Humans
  • Product Surveillance, Postmarketing / methods*
  • Product Surveillance, Postmarketing / statistics & numerical data
  • Reproducibility of Results
  • Retrospective Studies
  • Software
  • United States
  • United States Food and Drug Administration