Comparing the reliability and validity of the SF-36 and SF-12 in measuring quality of life among adolescents in China: a large sample cross-sectional study

Health Qual Life Outcomes. 2020 Nov 9;18(1):360. doi: 10.1186/s12955-020-01605-8.

Abstract

Objective: To compare the reliability and validity of the Short Form 36 (version 1, SF-36) and the Short Form 12 (version 1, SF-12) in adolescence (10-19 years), the period from puberty to maturity that ends legally at the age of majority, and thereby to supply evidence for the selection of instruments measuring the quality of life (QOL) and decision-making processes of adolescents in China.

Methods: Stratified cluster random sampling was adopted according to geographical location, and the SF-36 was administered to assess QOL. Pearson correlation coefficients were used to assess correlations. Cronbach's alpha and construct reliability (CR) were used to evaluate the reliability of the SF-36 and SF-12, while criterion validity and average variance extracted (AVE, convergent validity) were used to evaluate validity. Confirmatory factor analysis was used to estimate the factor loadings of the SF-36 and SF-12 items, from which CR and AVE were obtained. The Samejima graded response model (two-parameter logistic model) in item response theory was used to estimate item discrimination, item difficulty, and average item information for the items of the SF-36 and SF-12.
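As an illustration of the reliability and convergent validity indices named above, the following is a minimal sketch and not the authors' code. It assumes the commonly used formulations: Cronbach's alpha computed from item and total-score variances, and CR and AVE computed from standardized factor loadings with uncorrelated error terms. The function names and the example loadings are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one domain.

    items: list of per-item score lists, one score per respondent,
    all lists of equal length.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Total score of each respondent across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def construct_reliability(loadings):
    """CR from standardized loadings, assuming error variance = 1 - loading^2."""
    squared_sum = sum(loadings) ** 2
    error_sum = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_sum)


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Hypothetical standardized loadings for a three-item domain.
example_loadings = [0.7, 0.8, 0.6]
print(round(construct_reliability(example_loadings), 3))
print(round(average_variance_extracted(example_loadings), 3))
```

Under these assumptions, a domain would meet the CR > 0.6 criterion reported in the Results when `construct_reliability` returns a value above 0.6.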

Results: A total of 19,428 respondents were included in the study. The mean age of respondents was 14.78 years (SD = 1.77). The reliability of each domain of the SF-36 was better than that of the corresponding domain of the SF-12. The PF, RP, BP, and GH domains of the SF-36 had good construct reliability (CR > 0.6). The criterion validities of some domains of the SF-36 were slightly higher than those of the corresponding domains of the SF-12, except for PCS. The convergent validities of the SF-12 were higher than those of the SF-36 in PF, RP, BP, and PCS. The BP, SF, RP, and VT items of the SF-12 had acceptable discrimination, which was higher than in the SF-36. The average item information for BP, VT, SF, RE, and MH was poor in both the SF-36 and SF-12.

Conclusion: The two component measures (PCS and MCS) of the SF-12 appeared to perform at least as well as those of the SF-36 in cross-sectional settings in adolescence, but the reliability and validity of the 8 domains of the SF-36 were better than those of the SF-12. Some domains, for instance SF and BP, were not suitable for adolescents or need to be studied further.

Keywords: Average information; Chinese adolescents; Discrimination; Quality of life; Reliability; SF12; SF36; Validity.

Publication types

  • Comparative Study
  • Validation Study

MeSH terms

  • Adolescent
  • Child
  • China
  • Cross-Sectional Studies
  • Factor Analysis, Statistical
  • Female
  • Humans
  • Male
  • Quality of Life*
  • Reproducibility of Results
  • Surveys and Questionnaires / standards*