Hierarchical classification of microorganisms based on high-dimensional phenotypic data

J Biophotonics. 2018 Mar;11(3). doi: 10.1002/jbio.201700047. Epub 2017 Dec 12.

Abstract

The classification of microorganisms by high-dimensional phenotyping methods such as Fourier transform infrared (FTIR) spectroscopy is often complicated by the complexity of microbial phylogenetic taxonomy. A hierarchical structure developed for such data can often facilitate the classification analysis. The hierarchical tree structure can either be imposed on a given set of phenotypic data by integrating the phylogenetic taxonomic structure, or be set up by revealing the inherent clusters in the phenotypic data. In this study, we compare different approaches to hierarchical classification of microorganisms based on high-dimensional phenotypic data. A set of 19 different species of molds (filamentous fungi) obtained from the mycological strain collection of the Norwegian Veterinary Institute (Oslo, Norway) is used for the study. Hierarchical cluster analysis is performed to set up the classification trees. Classification algorithms such as artificial neural networks (ANN), partial least squares discriminant analysis (PLS-DA) and random forest (RF) are used and compared. ANN and RF outperformed all the other approaches even though they did not use a predefined hierarchical structure. To our knowledge, the RF approach is used here for the first time to classify microorganisms by FTIR spectroscopy.
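
As an illustration of the data-driven route described in the abstract (a tree set up by hierarchical cluster analysis, then used for classification), the following is a minimal sketch and not the authors' pipeline. It assumes synthetic 5-point "spectra" for four hypothetical classes (A1, A2, B1, B2), builds the tree by average-linkage clustering of class mean spectra, and substitutes a simple nearest-centroid rule at each node in place of the ANN, PLS-DA or RF classifiers used in the study. All function names and data are assumptions made for the example.

```python
import math
import random


def centroid(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def leaves(node):
    """Class labels under a tree node (a node is a label or a pair of nodes)."""
    if isinstance(node, tuple):
        return leaves(node[0]) + leaves(node[1])
    return [node]


def build_tree(class_centroids):
    """Average-linkage agglomerative clustering over class mean spectra."""
    clusters = [(label, [label]) for label in sorted(class_centroids)]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                mi, mj = clusters[i][1], clusters[j][1]
                # Mean pairwise distance between the members of the two clusters.
                d = sum(
                    euclidean(class_centroids[a], class_centroids[b])
                    for a in mi
                    for b in mj
                ) / (len(mi) * len(mj))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


def classify(tree, x, class_centroids):
    """Descend the tree, choosing at each node the branch with the nearer mean."""
    node = tree
    while isinstance(node, tuple):
        node = min(
            node,
            key=lambda child: euclidean(
                x, centroid([class_centroids[l] for l in leaves(child)])
            ),
        )
    return node


# Synthetic stand-in data (assumption): two groups of two similar classes each.
rng = random.Random(0)
prototypes = {
    "A1": [0.0, 0.0, 0.0, 0.0, 0.0],
    "A2": [1.0, 0.0, 0.0, 0.0, 0.0],
    "B1": [10.0, 10.0, 0.0, 0.0, 0.0],
    "B2": [10.0, 11.0, 0.0, 0.0, 0.0],
}
samples = {
    label: [[v + rng.gauss(0, 0.1) for v in proto] for _ in range(20)]
    for label, proto in prototypes.items()
}
class_centroids = {label: centroid(rows) for label, rows in samples.items()}

tree = build_tree(class_centroids)
predicted = classify(tree, [10.0, 11.0, 0.0, 0.0, 0.0], class_centroids)
print(tree)
print(predicted)
```

A flat classifier, such as the ANN and RF models the abstract reports as performing best, would skip the tree entirely and assign one of the leaf classes in a single step; the sketch only shows what "using a hierarchical structure" means operationally.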

Keywords: FTIR spectroscopy of microorganisms; classification analysis; hierarchical tree structure.

MeSH terms

  • Classification / methods*
  • Discriminant Analysis
  • Fungi / classification*
  • Least-Squares Analysis
  • Neural Networks, Computer
  • Phenotype*
  • Phylogeny