Classification and identification of fungal sequences using characteristic restriction endonuclease cut order

J Bioinform Comput Biol. 2010 Apr;8(2):181-98. doi: 10.1142/s0219720010004616.

Abstract

Restriction Fragment Length Polymorphism (RFLP) is a powerful molecular tool that is extensively used in the molecular fingerprinting and epidemiological studies of microorganisms. In a wet-lab setting, the DNA is cut with one or more restriction enzymes and subjected to gel electrophoresis to obtain signature fragment patterns, which are used in the classification and identification of organisms. This wet-lab approach may not be practical when the experimental data set includes a large number of genetic sequences and a wide pool of restriction enzymes to choose from. In this study, we introduce a novel concept of Enzyme Cut Order, a biological property-based characteristic of DNA sequences that can be defined and analyzed computationally without any alignment algorithm. In this alignment-free approach, a similarity matrix is developed based on the pairwise Longest Common Subsequences (LCS) of the Enzyme Cut Orders. The choice of an ideal set of restriction enzymes for the analysis is aided by genetic algorithms. The results obtained from this approach, using internal transcribed spacer regions of rDNA from fungi as the target sequence, show that phylogenetically related organisms form a single cluster and that successful grouping of phylogenetically close or distant organisms depends on the choice of restriction enzymes used in the analysis. Additionally, comparison of trees obtained with this alignment-free method and the legacy method revealed highly similar tree topologies. This novel alignment-free method, which utilizes the Enzyme Cut Order and restriction enzyme profile, is a reliable alternative to local or global alignment-based classification and identification of organisms.
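
The idea of comparing sequences by Enzyme Cut Order and pairwise LCS can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the cut order is taken to be the list of enzyme names sorted by the position of their recognition sites, the similarity is the LCS length divided by the longer cut order, and the three-enzyme panel and the function names `cut_order`, `lcs_length`, and `similarity_matrix` are hypothetical. Only exact recognition sites on one strand are matched, and the genetic-algorithm enzyme selection is omitted.

```python
from typing import Dict, List

# Hypothetical enzyme panel (name -> recognition site), chosen for illustration.
ENZYMES: Dict[str, str] = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def cut_order(seq: str, enzymes: Dict[str, str]) -> List[str]:
    """Return enzyme names ordered by where their sites occur in seq.

    Assumption: the position of the recognition site stands in for the
    actual cut position, which is a simplification.
    """
    hits = []
    for name, site in enzymes.items():
        start = seq.find(site)
        while start != -1:
            hits.append((start, name))
            start = seq.find(site, start + 1)
    hits.sort()
    return [name for _, name in hits]


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity_matrix(orders: List[List[str]]) -> List[List[float]]:
    """Pairwise LCS similarity, normalized by the longer cut order (assumed)."""
    n = len(orders)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            longest = max(len(orders[i]), len(orders[j]))
            if longest == 0:
                matrix[i][j] = 1.0  # two empty cut orders are treated as identical
            else:
                matrix[i][j] = lcs_length(orders[i], orders[j]) / longest
    return matrix


# Two made-up toy sequences, not real ITS data.
seq_a = "AAGAATTCTTGGATCCAAAAGCTT"
seq_b = "GGATCCTTGAATTCAAAAGCTT"
orders = [cut_order(seq_a, ENZYMES), cut_order(seq_b, ENZYMES)]
matrix = similarity_matrix(orders)
```

A matrix of this kind could then be converted to distances (for example, one minus the similarity) and passed to a standard clustering or tree-building routine. That step is likewise an assumption here, not a detail given in the abstract.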

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Base Sequence
  • Cluster Analysis
  • Computational Biology
  • DNA, Fungal / classification*
  • DNA, Fungal / genetics*
  • DNA, Ribosomal / classification
  • DNA, Ribosomal / genetics
  • Databases, Nucleic Acid
  • Fungi / classification*
  • Fungi / genetics*
  • Phylogeny
  • Polymorphism, Restriction Fragment Length
  • Software Design

Substances

  • DNA, Fungal
  • DNA, Ribosomal