Developing Computerized Adaptive Testing for a National Health Professionals Exam: An Attempt from Psychometric Simulations

Perspect Med Educ. 2023 Oct 31;12(1):462-471. doi: 10.5334/pme.855. eCollection 2023.

Abstract

Introduction: The accurate assessment of health professionals' competence is critical for ensuring public health safety and quality of care. Computerized Adaptive Testing (CAT) based on Item Response Theory (IRT) has the potential to improve measurement accuracy and reduce respondent burden. In this study, we conducted psychometric simulations to develop a CAT for evaluating the competence of health professional candidates.

Methods: The initial CAT item bank was sourced from the Standardized Competence Test for Clinical Medicine Undergraduates (SCTCMU), a nationwide summative test in China consisting of 300 multiple-choice items. We randomly selected response data from 2000 Chinese clinical medicine undergraduates for analysis. Two types of analyses were performed: first, evaluating the psychometric properties of all items to confirm they met the requirements of CAT; and second, conducting multiple CAT simulations using both simulated and real response data.
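The abstract does not report implementation details, but the kind of simulation it describes can be illustrated with a minimal sketch. The Python code below assumes a hypothetical 2PL-calibrated item bank, maximum-information item selection, and EAP ability estimation with a fixed test length; these choices and parameter values are assumptions for illustration, not the study's actual procedure.

```python
# Minimal sketch of a 2PL-based CAT simulation (illustrative only; item
# parameters, bank size, and stopping rule are assumptions, not values
# reported by the study).
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical calibrated item bank: discrimination (a) and difficulty (b).
n_items = 121
a = rng.uniform(0.8, 2.0, n_items)
b = rng.normal(0.0, 1.0, n_items)

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a**2 * p * (1.0 - p)

def eap_estimate(responses, admin, grid=np.linspace(-4, 4, 81)):
    """Expected a posteriori ability estimate with a standard normal prior."""
    prior = np.exp(-0.5 * grid**2)
    like = np.ones_like(grid)
    for idx, resp in zip(admin, responses):
        p = p_correct(grid, a[idx], b[idx])
        like *= p if resp else (1.0 - p)
    post = prior * like
    post /= post.sum()
    return float(np.sum(grid * post))

def simulate_cat(true_theta, max_items=30):
    """Administer items by maximum information until max_items is reached."""
    admin, responses, theta = [], [], 0.0
    for _ in range(max_items):
        info = item_information(theta, a, b)
        info[admin] = -np.inf            # do not reuse administered items
        nxt = int(np.argmax(info))
        resp = rng.random() < p_correct(true_theta, a[nxt], b[nxt])
        admin.append(nxt)
        responses.append(resp)
        theta = eap_estimate(responses, admin)
    return theta

print(simulate_cat(true_theta=1.2))
```

In a full simulation study, this loop would be run over many simulated (or real) examinees, and the recovered ability estimates compared with the generating abilities or full-test scores.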

Results: The final CAT item bank consisted of 121 items, for which item parameters were calculated using a two-parameter logistic model (2PLM). The CAT simulations, based on both simulated and real data, revealed sufficient marginal reliability (coefficient of marginal reliability above 0.750) and criterion-related validity (Pearson's correlations between CAT scores and aggregate scores of the SCTCMU exceeding 0.850).
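For illustration, the two reported indices can be computed from CAT output roughly as follows. The helper names and the particular marginal-reliability formulation are assumptions, since the abstract does not state how the coefficient was obtained.

```python
# Illustrative computation of marginal reliability and criterion-related
# validity for CAT ability estimates (formulation assumed, not taken from
# the paper).
import numpy as np

def marginal_reliability(theta_hat, se):
    """One common formulation: (observed variance - mean error variance) / observed variance."""
    error_var = np.mean(np.asarray(se) ** 2)
    total_var = np.var(theta_hat, ddof=1)
    return (total_var - error_var) / total_var

def criterion_validity(cat_scores, full_test_scores):
    """Pearson correlation between CAT estimates and full-test criterion scores."""
    return np.corrcoef(cat_scores, full_test_scores)[0, 1]
```

Here `se` would be the conditional standard errors of the final ability estimates and `full_test_scores` the SCTCMU aggregate scores used as the criterion.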

Discussion: In national-level medical education assessment, there is an increasing need for concise yet valid evaluations of the competence of health professional candidates. The CAT developed in this study demonstrated satisfactory reliability and validity, offering a more efficient assessment of candidates' competence. Its psychometric properties could lead to shorter test durations, reduced information loss, and a lighter testing burden for participants.

MeSH terms

  • Computerized Adaptive Testing*
  • Health Personnel*
  • Humans
  • Psychometrics
  • Reproducibility of Results
  • Students

Grants and funding

This work was supported by the National Natural Science Foundation of China for Young Scholars under Grant 72104006, Peking University Health Science Center under Grant BMU2021YJ010, the Key Laboratory of Digital Educational Publishing Technology and Standards and the Digital Education Research Institute of the People's Education Press under Grant RJB0623001, and the Peking University Health Science Center Medical Education Research Funding Project under Grants 2023YB24 and 2022YB41.