Ensemble Based Approach for Time Series Classification in Metabolomics

Stud Health Technol Inform. 2019:260:89-96.

Abstract

Background: Machine learning is an important application in health informatics; however, classification methods for longitudinal data are still rare.

Objectives: The aim of this work is to analyze and classify differences in metabolite time series data between groups of individuals regarding their athletic activity.

Methods: We propose a new ensemble-based 2-tier approach to classify metabolite time series data. The first tier uses polynomial fitting to generate a class prediction for each metabolite. In the second tier, an induced classifier (k-nearest-neighbor or naïve Bayes) combines these per-metabolite predictions to produce a final prediction. Metabolite levels of 47 individuals undergoing a cycle ergometry test were measured using mass spectrometry.
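The two-tier idea can be illustrated with a minimal, dependency-free Python sketch. The abstract does not state how the polynomial fit is turned into a class prediction, so the rule used here (fit one polynomial per class and metabolite, then assign the class whose curve has the smallest squared residual) is an assumption, as are the function names, the group labels "A" and "B", the polynomial degree, and the synthetic toy data. The second tier is shown with a k-nearest-neighbor vote only.

```python
from collections import Counter

def polyfit(ts, ys, degree):
    """Least-squares polynomial coefficients, lowest order first."""
    n = degree + 1
    # Augmented normal equations [X^T X | X^T y] for the Vandermonde matrix X.
    m = [[sum(t ** (i + j) for t in ts) for j in range(n)]
         + [sum(y * t ** i for t, y in zip(ts, ys))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [a - f * b for a, b in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def polyval(coefs, t):
    return sum(c * t ** k for k, c in enumerate(coefs))

def fit_tier1(times, subjects, labels, degree=2):
    """Tier 1 (assumed rule): one fitted curve per metabolite and class.

    subjects[s][m] is the time series of metabolite m for subject s.
    """
    curves = []
    for m in range(len(subjects[0])):
        per_class = {}
        for lab in sorted(set(labels)):
            ts, ys = [], []
            for subj, subj_lab in zip(subjects, labels):
                if subj_lab == lab:
                    ts.extend(times)
                    ys.extend(subj[m])
            per_class[lab] = polyfit(ts, ys, degree)
        curves.append(per_class)
    return curves

def tier1_votes(curves, times, subject):
    """One class prediction per metabolite: the class with the closest curve."""
    votes = []
    for per_class, ys in zip(curves, subject):
        sse = {lab: sum((polyval(c, t) - y) ** 2 for t, y in zip(times, ys))
               for lab, c in per_class.items()}
        votes.append(min(sorted(sse), key=sse.get))
    return votes

def knn_predict(train_votes, train_labels, votes, k=3):
    """Tier 2: k-NN on the vote vectors, using Hamming distance."""
    ranked = sorted((sum(a != b for a, b in zip(tv, votes)), i)
                    for i, tv in enumerate(train_votes))
    nearest = [train_labels[i] for _, i in ranked[:k]]
    return Counter(nearest).most_common(1)[0][0]

# Synthetic toy data (not from the study): 2 metabolites, 5 time points.
times = [0, 1, 2, 3, 4]

def make_subject(scale, shift):
    return [[scale * t * t + shift for t in times],
            [scale * t + shift for t in times]]

train = [make_subject(1.0, s) for s in (0.0, 0.1, -0.1)] \
      + [make_subject(1.5, s) for s in (0.0, 0.1, -0.1)]
train_labels = ["A"] * 3 + ["B"] * 3

curves = fit_tier1(times, train, train_labels)
train_votes = [tier1_votes(curves, times, s) for s in train]
test_votes = tier1_votes(curves, times, make_subject(1.45, 0.05))
prediction = knn_predict(train_votes, train_labels, test_votes)
print(test_votes, prediction)
```

For brevity, the sketch trains the second tier on votes produced by curves fitted to the same subjects. In a stacking setup these votes would normally come from held-out fits, so that the second tier does not learn from overly optimistic first-tier predictions.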

Results: In accordance with our previous work, the statistical results indicate strong changes over time. We found only small but systematic differences between the groups. Nevertheless, our proposed stacking approach obtained a mean accuracy of 78% using 10-fold cross-validation.
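As a rough illustration of how a mean accuracy under 10-fold cross-validation is computed for 47 subjects, the following Python sketch splits subject indices into ten folds and averages the per-fold accuracies. The labels, the fixed shuffle seed, and the majority-class placeholder classifier are hypothetical stand-ins and are not taken from the study; any classifier with the same calling convention could be plugged in.

```python
import random
from collections import Counter

def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(labels, folds, fit_predict):
    """Mean of per-fold accuracies; each fold serves once as the test set."""
    accuracies = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        preds = fit_predict(train_idx, fold)
        correct = sum(p == labels[i] for p, i in zip(preds, fold))
        accuracies.append(correct / len(fold))
    return sum(accuracies) / len(accuracies)

# Hypothetical labels for 47 subjects (group sizes are made up).
labels = ["A"] * 24 + ["B"] * 23

def majority_classifier(train_idx, test_idx):
    """Placeholder model: always predicts the most common training label."""
    top = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    return [top] * len(test_idx)

folds = kfold_indices(len(labels), 10)
mean_acc = cross_val_accuracy(labels, folds, majority_classifier)
print([len(f) for f in folds], round(mean_acc, 3))
```

With 47 subjects, seven folds contain five subjects and three contain four, so each per-fold accuracy rests on very few test cases, and the mean across folds is the more stable summary.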

Conclusion: Our proposed approach achieves considerable classification performance for time series data with small differences between the groups.

Keywords: biomarkers; classification; kinetics; time series.

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Humans
  • Machine Learning*
  • Medical Informatics*
  • Metabolomics*