An Introduction to Infinite HMMs for Single-Molecule Data Analysis

Biophys J. 2017 May 23;112(10):2021-2029. doi: 10.1016/j.bpj.2017.04.027.

Abstract

The hidden Markov model (HMM) has been a workhorse of single-molecule data analysis and is now commonly used either as a stand-alone tool for time series analysis or in conjunction with other analysis methods such as tracking. Here, we provide a conceptual introduction to an important generalization of the HMM that is poised to have a deep impact across biophysics: the infinite HMM (iHMM). As a modeling tool, the iHMM can analyze sequential data without requiring the number of states to be fixed a priori, a requirement of the traditional (finite) HMM. Although the current literature on the iHMM is written primarily for audiences in statistics, the idea is powerful, and the iHMM's breadth of applicability outside machine learning and data science warrants a careful exposition. We explain the key ideas underlying the iHMM, with special emphasis on implementation, and describe the code that we are making freely available. In a companion article, we extend the iHMM to accommodate complications such as drift.
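To make the idea of an unbounded number of states more concrete, the following minimal sketch illustrates a stick-breaking prior, one common device in Bayesian nonparametric models for assigning weights to a potentially unlimited set of states. It is an assumption chosen for illustration and is not taken from the article or its released code. The function name `stick_breaking_weights`, the concentration parameter `alpha`, and the tolerance `tol` are hypothetical.

```python
import random


def stick_breaking_weights(alpha, tol=1e-6, rng=None):
    """Draw state weights by repeatedly breaking a unit-length stick.

    Illustrative sketch only: alpha (concentration) and tol (stopping
    tolerance) are hypothetical names, not parameters from the article.
    """
    rng = rng or random.Random()
    weights = []
    remaining = 1.0  # stick length not yet assigned to any state
    while remaining > tol:
        # Each new state takes a Beta(1, alpha) share of what is left.
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining


# The number of states is not chosen in advance; it is however many
# pieces were needed before the leftover stick fell below tol.
weights, leftover = stick_breaking_weights(alpha=2.0, rng=random.Random(0))
print(len(weights), "states carry all but", leftover, "of the weight")
```

In this sketch, larger values of `alpha` tend to spread the weight over more states, while smaller values concentrate it on a few. A sampler built on such a prior would let the data determine how many states are actually occupied.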

Publication types

  • Review

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Markov Chains*
  • Models, Molecular*
  • Statistics, Nonparametric