Addressing the identification problem in age-period-cohort analysis: a tutorial on the use of partial least squares and principal components analysis

Epidemiology. 2012 Jul;23(4):583-93. doi: 10.1097/EDE.0b013e31824d57a9.

Abstract

In the analysis of trends in health outcomes, an ongoing issue is how to separate and estimate the effects of age, period, and cohort. As these 3 variables are perfectly collinear by definition, regression coefficients in a general linear model are not unique. In this tutorial, we review why identification is a problem, and how this problem may be tackled using partial least squares and principal components regression analyses. Both methods produce regression coefficients that fulfill the same collinearity constraint as the variables age, period, and cohort. We show that, because the constraint imposed by partial least squares and principal components regression is inherent in the mathematical relation among the 3 variables, the results are more interpretable. We use one dataset from a Taiwanese health-screening program to illustrate how to use partial least squares regression to analyze the trends in body height with 3 continuous variables for age, period, and cohort. We then use another dataset of hepatocellular carcinoma mortality rates for Taiwanese men to illustrate how to use partial least squares regression to analyze tables with aggregated data. We use the second dataset to show the relation between the intrinsic estimator, a recently proposed method for age-period-cohort analysis, and partial least squares regression. We also show that the inclusion of all indicator variables provides a more consistent approach. R code for our analyses is provided in the eAppendix.
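
The collinearity described above can be illustrated with a small numerical sketch. It is not the authors' R code from the eAppendix; it is a Python/NumPy illustration on simulated data, and every number in it (sample size, age and period ranges, the coefficients used to generate the outcome) is a hypothetical assumption. It builds cohort as period minus age, confirms that the centered design matrix has rank 2 rather than 3, and then runs a principal components regression that keeps only the components with non-zero variance. The resulting coefficients satisfy b_age - b_period + b_cohort = 0, the same linear relation that holds among the centered variables themselves. Partial least squares is not shown here.

```python
import numpy as np

# Simulated data: all values below are hypothetical, for illustration only.
rng = np.random.default_rng(0)
n = 200
age = rng.integers(20, 80, size=n).astype(float)
period = rng.integers(1990, 2010, size=n).astype(float)
cohort = period - age  # perfect collinearity by definition

# Hypothetical outcome (e.g. a height-like measurement) with noise.
y = 170.0 - 0.05 * age + 0.02 * period + 0.10 * cohort + rng.normal(0.0, 1.0, n)

# Center predictors and outcome so that no intercept is needed.
X = np.column_stack([age, period, cohort])
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# Singular value decomposition of the centered design matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Keep only components whose singular value is not numerically zero.
keep = s > 1e-8 * s[0]
rank = int(keep.sum())

# Principal components regression on the retained components,
# mapped back to coefficients for age, period, and cohort.
b_pcr = Vt[keep].T @ ((U[:, keep].T @ yc) / s[keep])

# The coefficients obey the same relation as the variables:
# age - period + cohort = 0 (after centering).
constraint = b_pcr[0] - b_pcr[1] + b_pcr[2]

print("rank of centered design matrix:", rank)
print("coefficients (age, period, cohort):", b_pcr)
print("b_age - b_period + b_cohort:", constraint)
```

Dropping the zero-variance component is what selects one solution out of the infinitely many that fit equally well; under these assumptions it coincides with the minimum-norm least-squares solution.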

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Adult
  • Age Factors
  • Aged
  • Aged, 80 and over
  • Body Height
  • Carcinoma, Hepatocellular / mortality
  • Cohort Studies*
  • Data Interpretation, Statistical*
  • Effect Modifier, Epidemiologic*
  • Humans
  • Least-Squares Analysis*
  • Liver Neoplasms / mortality
  • Male
  • Middle Aged
  • Principal Component Analysis*
  • Taiwan / epidemiology