The comparison data forest: A new comparison data approach to determine the number of factors in exploratory factor analysis

Behav Res Methods. 2024 Mar;56(3):1838-1851. doi: 10.3758/s13428-023-02122-4. Epub 2023 Jun 15.

Abstract

Developing psychological assessment instruments often involves exploratory factor analyses, during which one must determine the number of factors to retain. Several factor-retention criteria have emerged that can infer this number from empirical data. Most recently, simulation-based procedures like the comparison data (CD) approach have shown the most accurate estimation of dimensionality. The factor forest, an approach combining extensive data simulation and machine learning modeling, showed even higher accuracy across various common data conditions. Because the factor forest is very computationally costly, we combine it with the comparison data approach to present the comparison data forest (CDF). In an evaluation study, we compared this new method with the common comparison data approach and identified optimal parameter settings for both methods given various data conditions. The new comparison data forest approach achieved slightly higher overall accuracy, though there were some important differences under certain data conditions. The CD approach tended to underfactor, whereas the CDF tended to overfactor. Their results were also complementary: in the 81.7% of instances in which the two methods identified the same number of factors, that number was correct 96.6% of the time.
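
The agreement figures above can be combined with simple arithmetic. The following is a minimal sketch, not the authors' code: the function name `consensus_factors` and the rule of accepting an estimate only when both methods agree are assumptions made for illustration. The two rates are taken from the abstract, which does not report accuracy for cases where the methods disagree, so the sketch leaves those cases undecided.

```python
from typing import Optional

# Rates reported in the abstract.
AGREEMENT_RATE = 0.817            # share of instances where CD and CDF agree
ACCURACY_GIVEN_AGREEMENT = 0.966  # share of those agreeing instances that are correct


def consensus_factors(cd_estimate: int, cdf_estimate: int) -> Optional[int]:
    """Hypothetical decision rule: accept the factor count only on agreement.

    Returns the shared estimate when CD and CDF agree, otherwise None to
    signal that further inspection is needed.
    """
    if cd_estimate == cdf_estimate:
        return cd_estimate
    return None


# Share of all instances in which both methods agree and are correct.
joint_agree_and_correct = AGREEMENT_RATE * ACCURACY_GIVEN_AGREEMENT

print(f"{joint_agree_and_correct:.3f}")  # prints 0.789
```

Under these figures, roughly 78.9% of all instances are ones in which the two methods agree on a correct number of factors. The remaining 18.3% of instances, in which they disagree, would need another basis for a decision.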

Keywords: Comparison data; Exploratory factor analysis; Factor forest; Factor retention; Machine learning; Number of factors.

MeSH terms

  • Computer Simulation
  • Factor Analysis, Statistical
  • Humans
  • Machine Learning*