AN APPROACH TO USING DATA MINING TO SUPPORT EARLY IDENTIFICATION OF ACROMEGALY

Endocr Pract. 2017 Apr 2;23(4):422-431. doi: 10.4158/EP161575.OR. Epub 2017 Jan 17.

Abstract

Objective: Data mining using insurance claims presents an opportunity to incorporate new analytic techniques in identifying rare conditions. This study aims to identify dyads of clinical conditions associated with acromegaly that may, with further validation and testing, be used to initially identify and diagnose this rare disease more accurately and efficiently.

Methods: This case-control study used two claims databases to identify acromegaly patients (cases) (International Classification of Diseases, Ninth Revision, Clinical Modification [ICD-9-CM]: 253.0) from 2008 to 2013. Each case was assigned two nonacromegaly controls (same age, gender, and region). Matched patients were randomly split into development and validation datasets. With expert clinician input, we isolated common associated conditions using ICD-9-CM codes. We identified all 2-way combinations of these conditions (dyads) and calculated the rate of each dyad and its relative risk (RR) compared with controls. Dyads meeting certain criteria (case rate ≥5% [or ≥1% if RR ≥5] or observed RR > expected) were replicated in the validation dataset to confirm results.
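The dyad step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `dyad_stats` and `meets_rate_criterion`, the per-patient set representation, and the toy data are all hypothetical, and RR is assumed here to be the dyad rate among cases divided by the dyad rate among controls. The "observed RR > expected" criterion is omitted because the abstract does not define how the expected RR was derived.

```python
from itertools import combinations


def dyad_stats(cases, controls, conditions):
    """Return {(cond_a, cond_b): (case_rate, control_rate, rr)} for every pair.

    cases / controls: lists with one set of condition labels per patient.
    rr is None when no control has the dyad (the ratio is undefined).
    """
    stats = {}
    for a, b in combinations(sorted(conditions), 2):
        case_rate = sum(1 for p in cases if a in p and b in p) / len(cases)
        ctrl_rate = sum(1 for p in controls if a in p and b in p) / len(controls)
        rr = case_rate / ctrl_rate if ctrl_rate > 0 else None
        stats[(a, b)] = (case_rate, ctrl_rate, rr)
    return stats


def meets_rate_criterion(case_rate, rr):
    """Rate-based part of the stated criteria: >=5%, or >=1% when RR >= 5."""
    if case_rate >= 0.05:
        return True
    return case_rate >= 0.01 and rr is not None and rr >= 5


# Toy data (invented for illustration only): 4 cases, 8 matched controls.
toy_cases = [
    {"hypertension", "diabetes"},
    {"hypertension", "diabetes"},
    {"hypertension"},
    {"arthritis"},
]
toy_controls = [{"hypertension", "diabetes"}] + [set() for _ in range(7)]
toy_conditions = {"hypertension", "diabetes", "arthritis"}

for dyad, (case_rate, ctrl_rate, rr) in dyad_stats(
    toy_cases, toy_controls, toy_conditions
).items():
    print(dyad, case_rate, ctrl_rate, rr, meets_rate_criterion(case_rate, rr))
```

Because n conditions yield n(n-1)/2 unordered pairs, the 630 candidate dyads reported in the Results would be consistent with 36 conditions, assuming every possible pair was counted.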

Results: We identified 3,731 cases and 7,462 controls: mean age 41.8 (SD, 16.1) years, 51.8% female. A total of 32 and 38 dyads, reduced from 630, met study criteria. Among replicated dyads, case rates varied from 15.9% (hypertension and metabolic disorder) to 0.6% (arthritis and menstrual abnormalities). The highest RRs (e.g., valvular insufficiency and colon polyps [RR, 13.5; rate, 0.7%]) also exceeded expected values. Replication showed similar RR direction and size.

Conclusion: This novel analytic approach revealed several dyads that were significantly associated with an acromegaly diagnosis. Presence of high-risk condition pairs, if verified by a detailed data source (e.g., medical charts), may be incorporated into screening tools or serve as potential markers for physicians to consider an acromegaly diagnosis.

Abbreviations: ICD-9-CM = International Classification of Diseases, Ninth Revision, Clinical Modification; ID = identification; RR = relative risk.

Publication types

  • Validation Study

MeSH terms

  • Acromegaly / diagnosis*
  • Acromegaly / epidemiology
  • Adult
  • Case-Control Studies
  • Data Mining / statistics & numerical data*
  • Databases, Factual / statistics & numerical data
  • Early Diagnosis
  • Female
  • Humans
  • International Classification of Diseases
  • Male
  • Middle Aged
  • Retrospective Studies
  • Risk