Methods for Using Race and Ethnicity in Prediction Models for Lung Cancer Screening Eligibility

Rebecca Landy; Isabel Gomez; Tanner J Caverly; Kensaku Kawamoto; M Patricia Rivera; Hilary A Robbins; Corey D Young; Anil K Chaturvedi; Li C Cheung; Hormuzd A Katki

doi:10.1001/jamanetworkopen.2023.31155

JAMA Netw Open. 2023 Sep 5;6(9):e2331155. doi: 10.1001/jamanetworkopen.2023.31155.

Affiliations

¹ Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland.
² Biostatistics Department, University of Michigan, Ann Arbor.
³ Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor.
⁴ Department of Biomedical Informatics, University of Utah, Salt Lake City.
⁵ Division of Pulmonary and Critical Care Medicine and Wilmot Cancer Institute, University of Rochester, Rochester, New York.
⁶ Genomic Epidemiology Branch, International Agency for Research on Cancer, Lyon, France.
⁷ Department of Microbiology, Biochemistry and Immunology, Morehouse School of Medicine, Atlanta, Georgia.

Abstract

Importance: Using race and ethnicity in clinical prediction models can reduce or inadvertently increase racial and ethnic disparities in medical decisions.

Objective: To compare eligibility for lung cancer screening in a contemporary representative US population under 2 approaches: (1) refitting the life-years gained from screening-computed tomography (LYFS-CT) model to exclude race and ethnicity and (2) a counterfactual eligibility approach that recalculates life expectancy for racial and ethnic minority individuals using the same covariates but substituting White race, then uses the higher of the 2 predicted life expectancies so that historically underserved groups are not penalized.
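The counterfactual step described in the objective can be sketched in a few lines. This is a minimal illustration only, not the LYFS-CT implementation: the `Person` fields, the `toy_life_expectancy` function, and its numeric offsets are hypothetical placeholders standing in for a fitted all-cause mortality submodel.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Person:
    # Hypothetical covariates; the real submodels use many more.
    age: int
    pack_years: float
    race_ethnicity: str


def toy_life_expectancy(p: Person) -> float:
    """Hypothetical stand-in for a race-aware life-expectancy model.

    The offsets are invented for illustration and are not estimates
    from the study.
    """
    base = max(0.0, 85.0 - p.age) - 0.05 * p.pack_years
    offset = {
        "White": 0.0,
        "African American": -1.5,
        "Asian American": 2.0,
        "Hispanic American": 1.0,
    }
    return max(0.0, base + offset[p.race_ethnicity])


def counterfactual_life_expectancy(
    p: Person, model: Callable[[Person], float]
) -> float:
    """Return the higher of the prediction for the person's own group
    and the prediction with White race substituted (same covariates)."""
    own = model(p)
    if p.race_ethnicity == "White":
        return own
    as_white = model(replace(p, race_ethnicity="White"))
    return max(own, as_white)


if __name__ == "__main__":
    for group in ("African American", "Asian American", "White"):
        person = Person(age=65, pack_years=40, race_ethnicity=group)
        print(group, counterfactual_life_expectancy(person, toy_life_expectancy))
```

Because the result is a maximum that always includes the person's own-group prediction, the counterfactual value can never fall below the race-aware one, which is why this approach cannot make anyone ineligible on life-expectancy grounds.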

Design, setting, and participants: The 2 submodels composing LYFS-CT NoRace were refit and externally validated without race and ethnicity: the lung cancer death submodel in participants of a large clinical trial (recruited 1993-2001; followed up until December 31, 2009) who ever smoked (n = 39 180) and the all-cause mortality submodel in the National Health Interview Survey (NHIS) 1997-2001 participants aged 40 to 80 years who ever smoked (n = 74 842, followed up until December 31, 2006). Screening eligibility was examined in NHIS 2015-2018 participants aged 50 to 80 years who ever smoked. Data were analyzed from June 2021 to September 2022.

Exposure: Including and removing race and ethnicity (African American, Asian American, Hispanic American, White) in each LYFS-CT submodel.

Main outcomes and measures: By race and ethnicity: calibration of the LYFS-CT NoRace model and the counterfactual approach (ratio of expected to observed [E/O] outcomes), US individuals eligible for screening, and predicted days of life gained from screening by LYFS-CT.
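As a rough illustration of the calibration measure, the E/O ratio divides the sum of model-predicted risks by the number of observed events; values below 1 indicate underestimated risk and values above 1 indicate overestimated risk. The sketch below is not the authors' code. The function name is hypothetical, and the confidence interval uses a simple Poisson-based log approximation, which is an assumption here because the abstract does not state how the intervals were computed (survey-weighted analyses such as NHIS would normally need design-based variance).

```python
import math
from typing import Sequence, Tuple


def expected_observed_ratio(
    predicted_risks: Sequence[float],
    observed_events: Sequence[int],
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Return (E/O, lower, upper) for one group.

    predicted_risks: model-predicted event probability per person.
    observed_events: 1 if the event occurred, else 0.
    The interval assumes the observed count is Poisson distributed
    (an assumption of this sketch, not a method taken from the study).
    """
    if len(predicted_risks) != len(observed_events):
        raise ValueError("inputs must have the same length")
    expected = sum(predicted_risks)
    observed = sum(observed_events)
    if observed == 0:
        raise ValueError("E/O is undefined when no events are observed")
    ratio = expected / observed
    half_width = z / math.sqrt(observed)  # SE of log(E/O) is about 1/sqrt(O)
    return ratio, ratio * math.exp(-half_width), ratio * math.exp(half_width)


if __name__ == "__main__":
    risks = [0.1, 0.2, 0.3, 0.4]   # made-up predicted risks
    events = [0, 1, 0, 1]          # made-up outcomes
    print(expected_observed_ratio(risks, events))
```

Computing this ratio separately within each racial and ethnic group is what reveals group-specific miscalibration that an overall E/O can hide.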

Results: The NHIS 2015-2018 included 25 601 individuals aged 50 to 80 years who ever smoked (2769 African American, 649 Asian American, 1855 Hispanic American, and 20 328 White individuals). Removing race and ethnicity from the submodels underestimated lung cancer death risk (E/O, 0.72; 95% CI, 0.52-1.00) and all-cause mortality (E/O, 0.90; 95% CI, 0.86-0.94) in African American individuals. It also overestimated mortality in Hispanic American (E/O, 1.08; 95% CI, 1.00-1.16) and Asian American individuals (E/O, 1.14; 95% CI, 1.01-1.30). Consequently, the LYFS-CT NoRace model increased Hispanic American and Asian American eligibility by 108% and 73%, respectively, while reducing African American eligibility by 39%. Using LYFS-CT with the counterfactual all-cause mortality model better maintained calibration across groups and increased African American eligibility by 13% without reducing eligibility for Hispanic American and Asian American individuals.

Conclusions and relevance: In this study, removing race and ethnicity miscalibrated LYFS-CT submodels and substantially reduced African American eligibility for lung cancer screening. Under counterfactual eligibility, no one became ineligible, and African American eligibility increased, demonstrating the potential for maintaining model accuracy while reducing disparities.

Publication types

Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.
Research Support, Non-U.S. Gov't
Research Support, N.I.H., Extramural
Research Support, N.I.H., Intramural

MeSH terms

Adult
Aged
Aged, 80 and over
Asian
Black or African American
Early Detection of Cancer* / statistics & numerical data
Eligibility Determination* / statistics & numerical data
Ethnicity
Hispanic or Latino
Humans
Life Expectancy
Lung Neoplasms* / diagnosis
Lung Neoplasms* / epidemiology
Lung Neoplasms* / ethnology
Mass Screening* / statistics & numerical data
Middle Aged
Minority Groups
Models, Statistical
Race Factors
Risk Assessment
White
