Algorithms for sequence analysis via mutagenesis

Bioinformatics. 2004 Oct 12;20(15):2401-10. doi: 10.1093/bioinformatics/bth258. Epub 2004 May 14.

Abstract

Motivation: Despite many successes of conventional DNA sequencing methods, some DNAs remain difficult or impossible to sequence. Unsequenceable regions occur in the genomes of many biologically important organisms, including humans. Such regions range in length from tens to millions of bases, and may contain valuable information such as the sequences of important genes. The authors have recently developed a technique that renders a wide range of problematic DNAs amenable to sequencing. The technique is known as sequence analysis via mutagenesis (SAM). This paper presents a number of algorithms for analysing and interpreting data generated by this technique.

Results: The essential idea of SAM is to infer the target sequence using the sequences of mutants derived from the target. We describe three algorithms used in this process. The first algorithm predicts the number of mutants that will be required to infer the target sequence with a desired level of accuracy. The second algorithm infers the target sequence itself, using the mutant sequences. The third algorithm assigns a quality value to each inferred base. The algorithms are illustrated using mutant sequences generated in the laboratory.
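
Illustrative sketch (not the authors' algorithms): the abstract does not specify how the target is inferred or how quality values are computed, so the Python example below only shows the general idea of the second and third steps under simplifying assumptions. It assumes the mutant sequences are already aligned to equal length, that each base mutates independently with a known probability (the hypothetical parameter mutation_rate), and that a mutated base is equally likely to become any of the other three bases. The function name infer_target, the uniform prior over bases, and the Phred-style quality scale capped at 60 are all choices made for this illustration.

```python
import math

BASES = "ACGT"


def infer_target(mutants, mutation_rate=0.1):
    """Infer a target sequence and per-base qualities from aligned mutants.

    Assumptions (not taken from the paper): sequences are pre-aligned and of
    equal length, every base mutates independently with probability
    `mutation_rate`, and a mutation picks one of the 3 other bases uniformly.
    """
    if not mutants:
        raise ValueError("need at least one mutant sequence")
    length = len(mutants[0])
    if any(len(m) != length for m in mutants):
        raise ValueError("mutant sequences must be pre-aligned to equal length")

    log_same = math.log(1.0 - mutation_rate)   # observed base equals target
    log_diff = math.log(mutation_rate / 3.0)   # observed base differs

    consensus = []
    qualities = []
    for column in zip(*mutants):
        # Log-likelihood of this column for each candidate target base.
        log_like = {
            b: sum(log_same if x == b else log_diff for x in column)
            for b in BASES
        }
        # Normalise with a uniform prior; subtract the peak for stability.
        peak = max(log_like.values())
        weights = {b: math.exp(v - peak) for b, v in log_like.items()}
        total = sum(weights.values())
        best = max(BASES, key=lambda b: weights[b])

        # Posterior probability that the chosen base is wrong.
        p_error = sum(w for b, w in weights.items() if b != best) / total
        # Phred-style quality, capped at 60 to avoid log10(0).
        quality = 60.0 if p_error <= 1e-6 else -10.0 * math.log10(p_error)

        consensus.append(best)
        qualities.append(quality)
    return "".join(consensus), qualities
```

Under these assumptions, calling infer_target(["ACGT", "ACGA", "TCGT", "ACGT", "ACCT"]) recovers the consensus "ACGT", and positions where all mutants agree receive higher quality values than positions where one mutant disagrees. Adding more mutants sharpens the posterior at each position, which is the intuition behind predicting how many mutants are needed for a desired accuracy.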

Publication types

  • Comparative Study
  • Evaluation Study

MeSH terms

  • Algorithms*
  • Animals
  • Computer Simulation
  • DNA Mutational Analysis / methods*
  • Dictyostelium / genetics
  • Models, Genetic*
  • Models, Statistical
  • Mutagenesis / genetics*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*
  • Software*