Stability of INFIT and OUTFIT Compared to Simulated Estimates in Applied Setting

J Appl Meas. 2017;18(4):383-392.

Abstract

Residual-based fit statistics are commonly used to indicate the extent to which item response data fit the Rasch model. Fit statistic estimates are influenced by sample size, and rule-of-thumb critical values may lead to incorrect conclusions about the extent to which the model fits the data. Estimates obtained in this analysis were compared to those from 250 simulated data sets to examine the stability of the estimates. All INFIT estimates were within the rule-of-thumb range of 0.7 to 1.3. However, only 82% of the INFIT estimates fell between the 2.5th and 97.5th percentiles of the simulated items' INFIT distributions, which serve as a 95% confidence-like interval. This is an 18 percentage point difference in items that were classified as acceptable. Forty-eight percent of OUTFIT estimates fell within the 0.7 to 1.3 rule-of-thumb range, whereas 34% of OUTFIT estimates fell between the 2.5th and 97.5th percentiles of the simulated items' OUTFIT distributions. This is a 13 percentage point difference in items that were classified as acceptable. When using the rule-of-thumb ranges for fit estimates, the magnitude of misfit was smaller than with the 95% confidence interval of the simulated distribution. The findings indicate that using confidence intervals as critical values for fit statistics leads to different model-data fit conclusions than traditional rule-of-thumb critical values.
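
The comparison described above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a dichotomous Rasch model, treats the person and item parameters as known rather than re-estimating them for each simulated data set, and uses hypothetical names (`rasch_prob`, `item_fit`, `simulated_fit_bounds`) and made-up parameter values. It computes INFIT and OUTFIT mean squares per item and takes the 2.5th and 97.5th percentiles over 250 simulated data sets as item-specific critical values.

```python
import numpy as np

def rasch_prob(theta, b):
    """P(X = 1) under the dichotomous Rasch model, persons in rows, items in columns."""
    return 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))

def item_fit(x, p):
    """Return (infit, outfit) mean-square statistics for each item."""
    w = p * (1.0 - p)                              # model variance of each response
    sq_resid = (x - p) ** 2                        # squared raw residuals
    outfit = (sq_resid / w).mean(axis=0)           # mean of squared standardized residuals
    infit = sq_resid.sum(axis=0) / w.sum(axis=0)   # information-weighted mean square
    return infit, outfit

def simulated_fit_bounds(theta, b, n_sims=250, seed=0):
    """2.5th and 97.5th percentiles of simulated INFIT and OUTFIT, per item.

    Assumption: theta and b are held fixed at the supplied values; a fuller
    study would re-estimate them from each simulated data set.
    """
    rng = np.random.default_rng(seed)
    p = rasch_prob(theta, b)
    infit = np.empty((n_sims, b.size))
    outfit = np.empty((n_sims, b.size))
    for s in range(n_sims):
        x = (rng.random(p.shape) < p).astype(float)  # responses generated to fit the model
        infit[s], outfit[s] = item_fit(x, p)
    q = [2.5, 97.5]
    return np.percentile(infit, q, axis=0), np.percentile(outfit, q, axis=0)

# Hypothetical example: 500 persons, 10 items (values are illustrative only).
rng = np.random.default_rng(1)
theta = rng.normal(size=500)
b = np.linspace(-2.0, 2.0, 10)
infit_bounds, outfit_bounds = simulated_fit_bounds(theta, b)

# An observed estimate would be flagged when it falls outside its item's bounds,
# instead of outside the fixed 0.7 to 1.3 range.
```

Because the simulated bounds depend on sample size and item location, they can be narrower or wider than 0.7 to 1.3, which is one way the two criteria can classify the same item differently.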

MeSH terms

  • Adult
  • Aged
  • Clinical Competence / statistics & numerical data*
  • Data Interpretation, Statistical
  • Educational Measurement / methods*
  • Educational Measurement / statistics & numerical data
  • Female
  • Humans
  • Male
  • Middle Aged
  • Models, Statistical*
  • Psychometrics / methods*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Surveys and Questionnaires*