Validity of the Spanish-Language Patient Health Questionnaires 2 and 9: A Systematic Review and Meta-Analysis

JAMA Netw Open. 2023 Oct 2;6(10):e2336529. doi: 10.1001/jamanetworkopen.2023.36529.

Abstract

Importance: Reliable screening for major depressive disorder (MDD) relies on valid and accurate screening tools.

Objective: To examine the validity, accuracy, and reliability of the Spanish-language Patient Health Questionnaires 2 and 9 (PHQ-2 and PHQ-9) to screen for MDD.

Data sources: PubMed, Web of Science, Embase, and PsycINFO from database inception through February 27, 2023.

Study selection: English- and Spanish-language studies evaluating the validity of the Spanish-language PHQ-2 or PHQ-9 in screening adults for MDD compared with a standardized clinical interview (gold standard). Search terms included PHQ-2, PHQ-9, depression, and Spanish.

Data extraction and synthesis: Two reviewers performed abstract and full-text reviews, data extraction, and quality assessment. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed. Random-effects meta-analyses of sensitivity, specificity, and area under the curve (AUC) were performed. Internal consistency was evaluated using Cronbach α and McDonald ω.
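For readers unfamiliar with the internal consistency statistic named above, the following is a minimal sketch of the standard Cronbach α formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). It is not the review's analysis code; the function name `cronbach_alpha` and the example responses are hypothetical and serve only as illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; uses sample variances throughout.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(r) != k for r in responses):
        raise ValueError("every respondent must answer all k items")

    # Variance of each item across respondents (columns of the response matrix).
    item_variances = [variance(column) for column in zip(*responses)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(r) for r in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up responses to a 2-item scale (not data from any included study).
example = [[0, 0], [1, 1], [2, 2], [3, 3]]
alpha = cronbach_alpha(example)
```

In this made-up example the two items move together perfectly, so α reaches its upper bound of 1; real questionnaire data yield lower values.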

Main outcomes and measures: Test accuracy and internal consistency. The PHQ-2 is composed of the first 2 questions of the PHQ-9 (targeting the core depression symptoms of depressed mood and anhedonia); a score of 3 or higher (score range, 0-6) is generally considered a positive depression screen. If a patient screens positive with the PHQ-2, a follow-up assessment with the PHQ-9 and a clinical diagnostic evaluation are recommended. Once depression is diagnosed, a PHQ-9 score of 10 or higher (score range, 0-27) is often considered an acceptable threshold for treating depression.
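The two-step use of the instruments described above can be sketched as follows. This is an illustrative sketch only, assuming the conventional cutoffs stated in this abstract (PHQ-2 score of 3 or higher; PHQ-9 score of 10 or higher) and item scores of 0 to 3, which is what the stated score ranges imply. The function names are hypothetical, and the sketch is not a clinical decision tool.

```python
PHQ2_CUTOFF = 3   # conventional positive-screen cutoff stated in the abstract
PHQ9_CUTOFF = 10  # conventional treatment threshold stated in the abstract


def phq_score(item_scores, n_items):
    """Sum item scores after checking the count and the 0-3 range per item."""
    if len(item_scores) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(item_scores)}")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    return sum(item_scores)


def phq2_screen_positive(phq9_items, cutoff=PHQ2_CUTOFF):
    """PHQ-2 uses the first 2 PHQ-9 items (depressed mood and anhedonia)."""
    return phq_score(phq9_items[:2], 2) >= cutoff


def phq9_meets_threshold(phq9_items, cutoff=PHQ9_CUTOFF):
    """True if the PHQ-9 total (range 0-27) is at or above the cutoff."""
    return phq_score(phq9_items, 9) >= cutoff


# Made-up responses for illustration only.
responses = [2, 1, 1, 2, 0, 1, 1, 2, 0]
positive_phq2 = phq2_screen_positive(responses)   # first two items sum to 3
meets_phq9 = phq9_meets_threshold(responses)      # total is 10
```

Passing the cutoff as a parameter makes it straightforward to explore the lower optimal cutoffs reported by the included studies, for example `phq2_screen_positive(responses, cutoff=2)`.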

Results: Ten cross-sectional studies involving 5164 Spanish-speaking adults (mean age range, 34.1-71.8 years) were included; most studies (n = 8) were in primary care settings. One study evaluated the PHQ-2, 7 evaluated the PHQ-9, and 2 evaluated both the PHQ-2 and PHQ-9. For the PHQ-2, optimal cutoff scores ranged from greater than or equal to 1 to greater than or equal to 2, with an overall pooled sensitivity of 0.89 (95% CI, 0.81-0.95), overall pooled specificity of 0.89 (95% CI, 0.81-0.95), and overall pooled AUC of 0.87 (95% CI, 0.83-0.90); Cronbach α was 0.71 to 0.75, and McDonald ω was 0.71. For the PHQ-9, optimal cutoff scores ranged from greater than or equal to 5 to greater than or equal to 12, with an overall pooled sensitivity of 0.86 (95% CI, 0.82-0.90), overall pooled specificity of 0.80 (95% CI, 0.75-0.85), and overall pooled AUC of 0.88 (95% CI, 0.87-0.90); Cronbach α was 0.78 to 0.90, and McDonald ω was 0.79 to 0.90. Four studies were considered to have low risk of bias; 6 studies had indeterminate risk of bias due to a lack of blinding information.

Conclusions and relevance: In this systematic review and meta-analysis, limited available evidence supported the use of the Spanish-language PHQ-2 and PHQ-9 in screening for MDD, but optimal cutoff scores varied greatly across studies, and few studies reported on blinding schemes. These results suggest that MDD should be considered in Spanish-speaking individuals with lower test scores. Given the widespread clinical use of the tools and the heterogeneity of existing evidence, further investigation is needed.

Publication types

  • Meta-Analysis
  • Systematic Review
  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Aged
  • Cross-Sectional Studies
  • Depressive Disorder, Major* / diagnosis
  • Humans
  • Language
  • Middle Aged
  • Patient Health Questionnaire*
  • Reproducibility of Results
  • Surveys and Questionnaires