Teaching Principal Components Using Correlations

Peter H Westfall; Andrea L Arias; Lawrence V Fulton

doi:10.1080/00273171.2017.1340824

Multivariate Behav Res. 2017 Sep-Oct;52(5):648-660. doi: 10.1080/00273171.2017.1340824. Epub 2017 Jul 17.

Authors

Peter H Westfall¹, Andrea L Arias²,³, Lawrence V Fulton⁴

Affiliations

¹ Area of Information Systems and Quantitative Sciences, Texas Tech University.
² School of Industrial Engineering, Pontificia Universidad Católica de Valparaíso.
³ Department of Industrial Engineering, Texas Tech University.
⁴ Area of Health Organization Management, Texas Tech University.

PMID: 28715259
DOI: 10.1080/00273171.2017.1340824

Abstract

Introducing principal components (PCs) to students is difficult. First, the matrix algebra and mathematical maximization lemmas are daunting, especially for students in the social and behavioral sciences. Second, the standard motivation involving variance maximization subject to a unit-length constraint does not directly connect to the "variance explained" interpretation. Third, the unit-length and uncorrelatedness constraints of the standard motivation do not allow re-scaling or oblique rotations, which are common in practice. Instead, we propose to motivate the subject in terms of optimizing (weighted) average proportions of variance explained in the original variables; this approach may be more intuitive, and hence easier to understand, because it links directly to the familiar "R-squared" statistic. It also removes the need for unit-length and uncorrelatedness constraints, provides a direct interpretation of "variance explained," and provides a direct answer to the question of whether to use covariance-based or correlation-based PCs. Furthermore, the presentation can be made without matrix algebra or optimization proofs. Modern tools from data science, including heat maps and text mining, provide further help in the interpretation and application of PCs; examples are given. Together, these techniques may be used to revise currently used methods for teaching and learning PCs in the behavioral sciences.
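
The "average R-squared" motivation can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' own code: the simulated data, the helper names `average_r_squared` and `leading_eigen`, and the choice of variances as weights are all hypothetical. It regresses each original variable on a single component score and averages the resulting R-squared values. With equal weights, the first correlation-based PC should give an average equal to the largest eigenvalue of the correlation matrix divided by the number of variables; with weights proportional to the variances, the first covariance-based PC should give the largest eigenvalue of the covariance matrix divided by its trace.

```python
import numpy as np

def average_r_squared(X, scores, weights=None):
    """(Weighted) mean R^2 from regressing each column of X on the score column(s)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                        # centering absorbs the intercept
    S = np.asarray(scores, dtype=float).reshape(len(X), -1)
    S = S - S.mean(axis=0)
    beta, *_ = np.linalg.lstsq(S, Xc, rcond=None)  # one least-squares fit per column
    resid = Xc - S @ beta
    r2 = 1.0 - (resid ** 2).sum(axis=0) / (Xc ** 2).sum(axis=0)
    return float(np.average(r2, weights=weights))

def leading_eigen(M):
    """Largest eigenvalue and its eigenvector for a symmetric matrix."""
    vals, vecs = np.linalg.eigh(M)                 # eigenvalues in ascending order
    return vals[-1], vecs[:, -1]

# Simulated data (an assumption for illustration): one latent factor,
# four observed variables measured on very different scales.
rng = np.random.default_rng(0)
n, p = 500, 4
latent = rng.normal(size=(n, 1))
X = latent @ np.array([[1.0, 0.8, 0.6, 0.2]]) + rng.normal(size=(n, p))
X = X * np.array([1.0, 10.0, 0.5, 3.0])

Xc = X - X.mean(axis=0)
Z = Xc / X.std(axis=0, ddof=1)                     # standardized variables

# Correlation-based PC1: equal weights on the R^2 values.
lam_R, v_R = leading_eigen(np.corrcoef(X, rowvar=False))
pc1_corr = Z @ v_R
avg_r2_corr = average_r_squared(X, pc1_corr)

# Covariance-based PC1: R^2 values weighted by each variable's variance.
cov = np.cov(X, rowvar=False)
lam_S, v_S = leading_eigen(cov)
pc1_cov = Xc @ v_S
variances = X.var(axis=0, ddof=1)
avg_r2_cov = average_r_squared(X, pc1_cov, weights=variances)

print(avg_r2_corr, lam_R / p)
print(avg_r2_cov, lam_S / np.trace(cov))

# Rescaling or shifting the component leaves the average R^2 unchanged,
# so no unit-length constraint is needed for this criterion.
print(average_r_squared(X, -3.7 * pc1_corr + 5.0))
```

In each printed pair the two numbers should agree up to floating-point error, which is the "variance explained" reading of the leading eigenvalue. Whether to use correlation-based or covariance-based PCs then reduces to whether the variables' R-squared values should count equally or in proportion to their variances.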

Keywords: Factor analysis; heat map; optimality; rotation; variance explained.

MeSH terms

Humans
Principal Component Analysis*
Teaching*