Bayesian predictiveness, exchangeability and sufficientness in bacterial taxonomy

Mats Gyllenberg; Timo Koski

doi:10.1016/s0025-5564(01)00096-7

Math Biosci. 2002 May-Jun:177-178:161-84. doi: 10.1016/s0025-5564(01)00096-7.

Authors

Mats Gyllenberg¹, Timo Koski

Affiliation

¹ Department of Mathematics, University of Turku, 20014 Turku, Finland. mats.gyllenberg@utu.fi

PMID: 11965254
DOI: 10.1016/s0025-5564(01)00096-7

Abstract

We present a theory of classification and predictive identification of bacteria. Bacterial strains are characterized by a binary vector and the taxonomy is specified by attaching a label to each vector. The theory is developed from only two basic assumptions, viz. that the sequence of pairs of feature vectors and the attached labels is judged (infinitely) exchangeable and predictively sufficient. We derive expressions for the training error and the probability of identification error and show that the latter is an affine function of the former. We prove the law of large numbers for identification matrices, which contain the fundamental information of bacterial data. We prove the Bayesian risk consistency of the predictive identification rule given by the theory and show that the training error is a consistent estimate of the generalization error.
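
The abstract does not state the form of the predictive identification rule, so the following is only a minimal sketch under stated assumptions. It assumes a Dirichlet-type predictive probability over (feature vector, label) pairs, a form commonly associated with exchangeability combined with a sufficientness postulate, and is not taken from the paper. The function names, the smoothing parameter `lam`, and the toy data are all hypothetical. The sketch tabulates counts of (vector, label) pairs, identifies a vector by the label with the largest predictive probability, and computes the training error of that rule.

```python
from collections import Counter

def predictive_prob(counts, n_categories, x, label, lam=1.0):
    """Assumed Dirichlet-type predictive probability of the pair (x, label).

    counts: Counter of observed (vector, label) pairs.
    n_categories: number of possible (vector, label) pairs.
    lam: hypothetical smoothing parameter (not from the abstract).
    """
    n = sum(counts.values())
    return (counts[(x, label)] + lam) / (n + lam * n_categories)

def identify(counts, labels, n_categories, x, lam=1.0):
    """Return the label with the largest predictive probability for vector x."""
    return max(labels, key=lambda c: predictive_prob(counts, n_categories, x, c, lam))

def training_error(data, labels, lam=1.0):
    """Fraction of training pairs whose label the rule fails to reproduce."""
    d = len(data[0][0])                    # number of binary features
    n_categories = (2 ** d) * len(labels)  # all possible (vector, label) pairs
    counts = Counter(data)
    wrong = sum(1 for x, c in data
                if identify(counts, labels, n_categories, x, lam) != c)
    return wrong / len(data)

# Toy data (invented for illustration): binary feature vectors with taxon labels.
data = [
    ((1, 0, 1), "A"),
    ((1, 0, 1), "A"),
    ((1, 0, 1), "B"),
    ((0, 1, 0), "B"),
    ((0, 1, 0), "B"),
]
labels = ["A", "B"]
err = training_error(data, labels)
print(err)
```

In this toy data the vector (1, 0, 1) appears under both labels, so a rule that depends only on the vector must misidentify at least one training strain. The count table here plays a role loosely analogous to the identification matrices mentioned in the abstract, though that correspondence is an assumption.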

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Bacteria / classification*
Bayes Theorem*
Classification / methods*
Models, Biological*