Characteristic Attribute Organization System (CAOS): Identifying Classification Rules Based on Phylogenetically Organized Sequences

Methods Mol Biol. 2024:2744:335-345. doi: 10.1007/978-1-0716-3581-0_21.

Abstract

Classification labels subjects according to characteristics of their data. It typically draws on information learned from preexisting data of the same distribution or data type to make an informed decision for each subject. The method presented here, the Characteristic Attribute Organization System (CAOS), takes a character-based approach to molecular sequence classification. Given a set of aligned sequences (either nucleotide or amino acid) and a maximum parsimony tree, CAOS generates classification rules for the sequences based on the tree structure and provides more interpretable results than other classification or sequence analysis protocols. The code is accessible at https://github.com/JuliaHealth/CAOS.jl/.
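
To make the idea of a character-based classification rule concrete, the following is a minimal Python sketch and not the CAOS.jl implementation or its API. It assumes the tree has already been reduced to a single split into named groups of aligned sequences, and it treats a rule as an alignment position whose character is shared by every member of one group and absent from all other groups. The function names `diagnostic_positions` and `classify`, the group labels, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Rules = Dict[str, List[Tuple[int, str]]]


def diagnostic_positions(groups: Dict[str, List[str]]) -> Rules:
    """Return, per group, the (position, character) pairs carried by every
    member of that group and by no sequence in any other group."""
    lengths = {len(s) for seqs in groups.values() for s in seqs}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    n = lengths.pop()

    rules: Rules = {name: [] for name in groups}
    for pos in range(n):
        # Character states observed at this column, per group.
        states = {name: {s[pos] for s in seqs} for name, seqs in groups.items()}
        for name, chars in states.items():
            if len(chars) != 1:
                continue  # column is not fixed within the group
            (char,) = chars
            if char == "-":
                continue  # ignore alignment gaps in this toy version
            if all(char not in other for o, other in states.items() if o != name):
                rules[name].append((pos, char))
    return rules


def classify(seq: str, rules: Rules) -> Optional[str]:
    """Assign seq to the group whose rules it matches most often,
    or return None if it matches no rule at all."""
    scores = {
        name: sum(1 for pos, ch in group_rules if seq[pos] == ch)
        for name, group_rules in rules.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Toy data (hypothetical): two groups standing in for the two sides of one tree node.
groups = {
    "clade_A": ["ACGTAC", "ACGTAA"],
    "clade_B": ["ATGTCC", "ATGTCA"],
}
rules = diagnostic_positions(groups)
print(rules)
print(classify("ATGTCC", rules))
```

In this sketch each rule can be read directly as "position p carries character c", which is the kind of interpretability the character-based approach aims for. A full tree-based workflow would repeat such a step at each internal node, which this example does not attempt.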

Keywords: CAOS; Classification; DNA barcoding; Diagnosis.

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Phylogeny*
  • Sequence Alignment / methods
  • Software*