Pooling data from fatality analysis reporting system (FARS) and generalized estimates system (GES) to explore the continuum of injury severity spectrum

Accid Anal Prev. 2015 Nov:84:112-27. doi: 10.1016/j.aap.2015.08.009. Epub 2015 Sep 3.

Abstract

The Fatality Analysis Reporting System (FARS) and the Generalized Estimates System (GES) are the datasets most commonly used to examine motor vehicle occupant injury severity in the United States (US). The FARS dataset focuses exclusively on fatal crashes, but provides detailed information on the continuum of fatality (a spectrum ranging from a death occurring within thirty days of the crash up to instantaneous death). While such data are beneficial for understanding fatal crashes, they inherently exclude crashes without fatalities. Hence, the exogenous factors identified in the FARS data as critical in contributing to (or reducing) fatality might have different effects on non-fatal crash severity levels when a truly random sample of crashes is considered. The GES data fill this gap by compiling data on a sample of roadway crashes covering all possible severity outcomes, providing a more representative sample of traffic crashes in the US. FARS data provide a continuous timeline of fatal occurrences measured from the time of the crash, as opposed to treating all fatalities as the same. This allows an analysis of the survival time of victims before their death. The GES, on the other hand, offers no such detail beyond identifying who died in the crash. The challenge in obtaining representative estimates for the crash population is the lack of readily available "appropriate" data containing the information available in both the GES and FARS datasets. One way to address this issue is to replace the fatal crashes in the GES data with fatal crashes from the FARS data, thus augmenting the GES sample with a very refined categorization of fatal crashes. The sample thus formed, if statistically valid, will provide a reasonable representation of the crash population. This paper focuses on developing a framework for pooling data from FARS and GES.
The pooled sample is validated against the original GES sample (the unpooled sample) through two methods: (1) univariate sample comparison and (2) econometric model parameter estimate comparison. The validation exercise indicates that parameter estimates obtained using the pooled data model closely resemble those obtained using the unpooled data. Having confirmed that the differences between the pooled and unpooled model estimates are within an acceptable margin, we then examine the whole spectrum of injury severity simultaneously on an eleven-point ordinal severity scale (no injury, minor injury, severe injury, incapacitating injury, and seven refined categories of fatalities ranging from fatality after 30 days to instant death) using a nationally representative pooled dataset. The model estimates are augmented by an elasticity analysis that illustrates the applicability of the proposed framework.
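The replacement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout, the field names `fatal` and `survival_hours`, and the choice to draw exactly as many FARS crashes as were removed from the GES sample are all assumptions made for illustration, and survey weights are ignored.

```python
import random


def pool_samples(ges_crashes, fars_crashes, seed=0):
    """Swap the fatal crashes in a GES-like sample for FARS-like records.

    Each crash is a dict. The "fatal" key on GES records and the
    "survival_hours" key on FARS records are hypothetical field names.
    """
    # Keep every non-fatal GES crash unchanged.
    non_fatal = [c for c in ges_crashes if not c["fatal"]]
    n_fatal = len(ges_crashes) - len(non_fatal)
    if n_fatal > len(fars_crashes):
        raise ValueError("not enough FARS crashes to replace the GES fatal crashes")
    # Assumption: draw as many FARS crashes as were removed, so the share of
    # fatal crashes in the pooled sample matches the original GES sample.
    rng = random.Random(seed)
    replacements = rng.sample(fars_crashes, n_fatal)
    return non_fatal + replacements


# Toy records, invented purely for this example.
ges = [
    {"id": "g1", "fatal": False},
    {"id": "g2", "fatal": True},
    {"id": "g3", "fatal": False},
    {"id": "g4", "fatal": True},
]
fars = [
    {"id": "f1", "survival_hours": 0.0},
    {"id": "f2", "survival_hours": 5.5},
    {"id": "f3", "survival_hours": 72.0},
]
pooled = pool_samples(ges, fars)
```

In this sketch the pooled sample keeps the size of the original GES sample, while every fatal record now carries a survival time that could be binned into refined fatality categories.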

Keywords: Data Pooling; Fatality; Fatality Analysis Reporting System (FARS); Generalized Estimates System (GES); Generalized ordered logit model.

MeSH terms

  • Accidents, Traffic / mortality*
  • Accidents, Traffic / statistics & numerical data*
  • Adult
  • Databases, Factual
  • Female
  • Humans
  • Injury Severity Score
  • Logistic Models
  • Male
  • Middle Aged
  • Mortality
  • Time Factors
  • United States