Background: Machine learning is an important application in health informatics; however, classification methods for longitudinal data are still rare.
Objectives: The aim of this work is to analyze and classify differences in metabolite time series data between groups of individuals defined by their athletic activity.
Methods: We propose a new ensemble-based two-tier approach to classify metabolite time series data. The first tier uses polynomial fitting to generate a class prediction for each metabolite. In the second tier, an induced classifier (k-nearest-neighbor or naïve Bayes) combines these per-metabolite predictions into a final prediction. Metabolite levels of 47 individuals undergoing a cycle ergometry test were measured using mass spectrometry.
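As a rough illustration of the two-tier idea described in Methods, the following Python sketch shows one way such a stacking scheme could be wired together. It is not the authors' implementation. The abstract does not state how polynomial fitting yields a class prediction, so the sketch assumes that one polynomial per class is fitted to the pooled training series of each metabolite and that a subject is assigned to the class whose curve has the smallest squared error. The function names, the polynomial degree, the value of k, and the Hamming distance in the second tier are likewise hypothetical choices made for illustration.

```python
import numpy as np

# Assumed data layout (hypothetical): X has shape
# (n_subjects, n_metabolites, n_timepoints), y holds class labels,
# and t holds the sampling times shared by all subjects.

def fit_class_polys(X, y, t, degree=2):
    """Tier 1 training: fit one polynomial per (metabolite, class) to pooled data."""
    classes = np.unique(y)
    n_met = X.shape[1]
    coefs = np.empty((n_met, len(classes), degree + 1))
    for m in range(n_met):
        for ci, c in enumerate(classes):
            series = X[y == c, m, :]                 # (n_subjects_in_class, n_timepoints)
            tt = np.tile(t, series.shape[0])         # repeat the time axis per subject
            coefs[m, ci] = np.polyfit(tt, series.ravel(), degree)
    return classes, coefs

def tier1_predict(X, t, classes, coefs):
    """Tier 1 prediction: per metabolite, vote for the class with the closest curve."""
    n_sub, n_met, _ = X.shape
    votes = np.empty((n_sub, n_met), dtype=classes.dtype)
    for m in range(n_met):
        curves = np.stack([np.polyval(c, t) for c in coefs[m]])      # (n_classes, n_timepoints)
        sse = ((X[:, m, None, :] - curves[None]) ** 2).sum(axis=2)   # (n_subjects, n_classes)
        votes[:, m] = classes[sse.argmin(axis=1)]
    return votes

def knn_predict(train_votes, train_y, test_votes, k=3):
    """Tier 2: k-nearest-neighbor on the vote vectors (Hamming distance, majority vote)."""
    preds = []
    for v in test_votes:
        dist = (train_votes != v).sum(axis=1)
        nearest = train_y[np.argsort(dist, kind="stable")[:k]]
        labels, counts = np.unique(nearest, return_counts=True)
        preds.append(labels[counts.argmax()])
    return np.array(preds)

def stacked_predict(X_train, y_train, X_test, t, degree=2, k=3):
    """Run both tiers. Simplification: tier 2 is trained on in-sample tier-1 votes."""
    classes, coefs = fit_class_polys(X_train, y_train, t, degree)
    train_votes = tier1_predict(X_train, t, classes, coefs)
    test_votes = tier1_predict(X_test, t, classes, coefs)
    return knn_predict(train_votes, y_train, test_votes, k)
```

In a cross-validation setting, `stacked_predict` would be called once per fold with that fold's training and held-out subjects. A more careful stacking setup would train the second tier on out-of-fold first-tier votes rather than in-sample votes, which this sketch omits for brevity.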
Results: In accordance with our previous work, the statistical results indicate strong changes over time. We found only small but systematic differences between the groups. Nevertheless, our proposed stacking approach obtained a mean accuracy of 78% in 10-fold cross-validation.
Conclusion: Our proposed classification approach achieves considerable classification performance on time series data with only small differences between the groups.
Keywords: biomarkers; classification; kinetics; time series.