Conditional or unconditional logistic regression for frequency matched case-control design?

Fei Wan

doi:10.1002/sim.9313

Stat Med. 2022 Mar 15;41(6):1023-1041. doi: 10.1002/sim.9313. Epub 2022 Jan 23.

Author

Fei Wan¹

Affiliation

¹ Division of Public Health Sciences, Washington University in St. Louis, St. Louis, Missouri, USA.

PMID: 35067958
DOI: 10.1002/sim.9313

Abstract

Frequency matching is commonly used in epidemiological case-control studies to balance the distributions of the matching factors between the case and control groups and to improve the efficiency of the design. Applied researchers commonly hold that unconditional logistic regression (ULR) should be used to analyze frequency-matched designs and that conditional logistic regression (CLR) is unnecessary. However, the justification for this view is unclear. To compare the performance of ULR and CLR in terms of simplicity, unbiasedness, and efficiency in a more intuitive way, we viewed frequency matching from the perspective of weighted sampling and derived the outcome models describing how the exposure and matching factors are associated with the outcome in the matched data in two scenarios: (1) only categorical variables are used for matching; (2) continuous variables are categorized for matching. In either scenario, the derived outcome model is a logit model with stratum-specific intercepts. Correctly specified ULR can be more efficient than CLR, particularly when continuous matching factors are used, whereas CLR is a more practical approach because it is less dependent on modeling choices.
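
The weighted-sampling view can be illustrated with a small numerical sketch. It is not taken from the paper's derivations. It assumes a hypothetical population model, logit P(D=1 | E, S=s) = alpha[s] + beta*E, with a binary exposure E, a categorical matching factor S, all cases retained, and controls sampled with a probability that depends only on the stratum. The stratum names, alpha, beta, and control_fraction values are invented for illustration. Under these assumptions the sampled data again follow a logit model in which only the stratum-specific intercepts change (by -log of the control sampling fraction), while the exposure coefficient is unchanged.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


# Hypothetical population model (assumption, not from the paper):
# logit P(D=1 | E, S=s) = alpha[s] + beta * E
alpha = {"young": -4.0, "middle": -3.0, "old": -2.0}
beta = 0.7

# Hypothetical stratum-specific control sampling fractions.
# Cases are assumed to be sampled with probability 1.
control_fraction = {"young": 0.02, "middle": 0.05, "old": 0.12}


def sampled_case_probability(stratum, exposed):
    """P(D=1 | E, S, selected) under stratum-dependent control sampling."""
    p = expit(alpha[stratum] + beta * exposed)
    w = control_fraction[stratum]
    return p / (p + w * (1.0 - p))


results = {}
for s in alpha:
    logit_unexposed = logit(sampled_case_probability(s, 0))
    logit_exposed = logit(sampled_case_probability(s, 1))
    results[s] = {
        # Intercept of the outcome model in the sampled data
        "intercept": logit_unexposed,
        # Exposure log odds ratio in the sampled data
        "log_or": logit_exposed - logit_unexposed,
        # Population intercept shifted by -log(sampling fraction)
        "expected_intercept": alpha[s] - math.log(control_fraction[s]),
    }
    print(s, results[s])
```

In this sketch the exposure log odds ratio in every stratum equals beta, and each intercept equals alpha[s] - log(control_fraction[s]), which is one way to see why a model with stratum-specific intercepts arises in matched data when matching uses categorical factors.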

Keywords: bias; case-control design; conditional logistic regression; frequency matching; unconditional logistic regression.

MeSH terms

Case-Control Studies
Humans
Logistic Models*