Prior Design for Dependent Dirichlet Processes: An Application to Marathon Modeling

PLoS One. 2016 Jan 28;11(1):e0147402. doi: 10.1371/journal.pone.0147402. eCollection 2016.

Abstract

This paper presents a novel application of Bayesian nonparametrics (BNP) for marathon data modeling. We make use of two well-known BNP priors, the single-p dependent Dirichlet process and the hierarchical Dirichlet process, in order to address two different problems. First, we study the impact of age, gender and environment on the runners' performance. We derive a fair grading method that allows direct comparison of runners regardless of their age and gender. Unlike current grading systems, our approach is based not only on top world records, but on the performances of all runners. The presented methodology for comparing densities can be readily adopted in many other applications, providing an interesting perspective on how to build dependent Dirichlet processes. Second, we analyze the running patterns of the marathoners over time, obtaining information that can be valuable for training purposes. We also show that these running patterns can be used to predict finishing time given intermediate interval measurements. We apply our models to the New York City, Boston and London marathons.
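
For readers unfamiliar with the Dirichlet process priors named in the abstract, the following is a minimal sketch of the standard truncated stick-breaking construction of a Dirichlet process mixture. It is not the authors' model: the Gaussian base measure, the truncation level, the component standard deviation, and the function names `stick_breaking_weights` and `sample_dp_mixture` are all assumptions chosen for illustration, with atoms loosely interpreted as mean finishing times in minutes.

```python
import numpy as np


def stick_breaking_weights(alpha, num_atoms, rng):
    """Draw truncated stick-breaking weights for a DP with concentration alpha."""
    # Each beta_k is the fraction of the remaining stick broken off at step k.
    betas = rng.beta(1.0, alpha, size=num_atoms)
    # Stick length remaining before each break: prod_{j<k} (1 - beta_j).
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining


def sample_dp_mixture(alpha, num_atoms, num_samples, seed=0):
    """Sample from a truncated DP mixture of Gaussians (illustrative only)."""
    rng = np.random.default_rng(seed)
    weights = stick_breaking_weights(alpha, num_atoms, rng)
    # Renormalize so the truncated weights sum to one.
    weights = weights / weights.sum()
    # Assumed base measure: component means around 240 minutes (hypothetical).
    atoms = rng.normal(240.0, 40.0, size=num_atoms)
    # Pick a mixture component for each sample, then draw around its mean.
    components = rng.choice(num_atoms, size=num_samples, p=weights)
    samples = rng.normal(atoms[components], 10.0)
    return weights, atoms, samples


if __name__ == "__main__":
    weights, atoms, samples = sample_dp_mixture(alpha=2.0, num_atoms=50, num_samples=1000)
    print("largest component weight:", weights.max())
    print("sample mean (minutes):", samples.mean())
```

A smaller `alpha` concentrates the mass on a few components, while a larger `alpha` spreads it over many. Dependent and hierarchical variants such as those used in the paper share or tie these weights and atoms across groups, which this sketch does not attempt.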

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Age Distribution
  • Aged
  • Algorithms
  • Athletic Performance
  • Bayes Theorem
  • Cluster Analysis
  • Female
  • Humans
  • Male
  • Middle Aged
  • Models, Statistical*
  • Pattern Recognition, Automated
  • Running
  • Sex Distribution
  • Statistics, Nonparametric
  • Young Adult

Grants and funding

The authors are grateful for financial support from the following institutions: Melanie F. Pradier is supported by the European Union 7th Framework Programme through the Marie Curie Initial Training Network “Machine Learning for Personalized Medicine” MLPM2012, Grant No. 316861 (http://mlpm.eu). Francisco J. R. Ruiz is supported by an FPU fellowship from the Spanish Ministry of Education (AP2010-5333). This work is also partially supported by the Ministerio de Economia of Spain (projects “COMONSENS”, id. CSD2008-00010, and “ALCIT”, id. TEC2012-38800-C03-01), by the Comunidad de Madrid (project “CASI-CAM-CM”, id. S2013/ICE-2845), and by the Office of Naval Research (ONR N00014-11-1-0651). Bell Labs Alcatel-Lucent provided support in the form of salaries for author FPC, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the “author contributions” section.