MLSTest: novel software for multi-locus sequence data analysis in eukaryotic organisms

Infect Genet Evol. 2013 Dec:20:188-96. doi: 10.1016/j.meegid.2013.08.029. Epub 2013 Sep 8.

Abstract

Multi-locus sequence typing (MLST) is a frequently used genotyping method whose goal is the unambiguous assignment of microorganisms to genetic clusters. MLST typically involves analysis of DNA sequences from several housekeeping gene loci, and it remains the gold standard for molecular typing of many bacterial pathogens. Eukaryotic pathogens have also been the subject of MLST; however, few tools are available to deal with diploid sequence data. Here we present MLSTest, novel software for MLST data analysis tailored towards diploid eukaryotes. The software implements various methods used in MLST and introduces some novel methodologies for evaluating the data set. In addition to constructing allelic profiles and performing basic clustering analysis, MLSTest looks for network structures in BURST graphs that suggest genetic exchange. It also offers several simple methods for tree construction, with the advantage of handling heterozygous or three-state sites. Furthermore, the software analyses whether concatenation of fragments from different genes is suitable for the data set, using different tests (BIONJ-incongruence length difference test, Templeton test). It evaluates how incongruence is distributed across the tree using a variation of the localized incongruence length difference test based on a modified neighbour joining algorithm. We tested this last method on simulated datasets and showed that it is conservative (adequate type I error rate) and moderately to highly powerful, and that it is useful for localizing incongruences in two bacterial and two eukaryotic MLST datasets. MLSTest was also designed for developing MLST schemes: it has tools to optimize locus combinations and to reduce the number of targets required for typing, and it analyses whether the discriminatory power of the typing scheme is increased by including more loci. We evaluated the software on simulated and real datasets from bacterial and eukaryotic microorganisms.
The software is freely available at http://www.ipe.unsa.edu.ar/software.
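
To make two of the ideas above concrete (building allelic profiles and checking whether adding loci raises discriminatory power), the following is a minimal Python sketch. It is not MLSTest's code or interface. The function names, the toy sequences, and the strain and locus labels are hypothetical. Two assumptions are made that the abstract does not state: heterozygous sites are written with IUPAC ambiguity codes (for example Y for C/T) and any distinct genotype string counts as a distinct allele, and discriminatory power is measured with the Hunter-Gaston index, 1 - sum(n_j(n_j - 1)) / (N(N - 1)).

```python
from collections import Counter


def allelic_profiles(seqs_by_locus):
    """Number the distinct sequences at each locus and return one profile per strain.

    seqs_by_locus: {locus: {strain: sequence}}. Loci are processed in sorted order.
    Assumption: a heterozygous site is an IUPAC ambiguity code, so a genotype
    string that differs in any character is treated as a different allele.
    """
    strains = sorted(next(iter(seqs_by_locus.values())))
    profiles = {strain: [] for strain in strains}
    for locus in sorted(seqs_by_locus):
        allele_ids = {}  # sequence -> allele number, in order of first appearance
        for strain in strains:
            seq = seqs_by_locus[locus][strain].upper()
            if seq not in allele_ids:
                allele_ids[seq] = len(allele_ids) + 1
            profiles[strain].append(allele_ids[seq])
    return {strain: tuple(p) for strain, p in profiles.items()}


def discriminatory_power(profiles, n_loci):
    """Hunter-Gaston index computed on the first n_loci loci of each profile."""
    type_counts = Counter(p[:n_loci] for p in profiles.values())
    n = sum(type_counts.values())
    if n < 2:
        return 0.0
    same_type_pairs = sum(c * (c - 1) for c in type_counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical toy data: four strains, two loci; "Y" marks a C/T heterozygous site.
toy = {
    "locusA": {"s1": "ACGT", "s2": "ACGT", "s3": "ACGY", "s4": "ACGC"},
    "locusB": {"s1": "GGTA", "s2": "GGTT", "s3": "GGTA", "s4": "GGTA"},
}
profiles = allelic_profiles(toy)
power_one_locus = discriminatory_power(profiles, 1)
power_two_loci = discriminatory_power(profiles, 2)
```

In this toy set, s1 and s2 share an allele at locusA and are only separated once locusB is added, so the index rises from one locus to two. Comparing the index across locus subsets in this way is one simple reading of "optimizing locus combinations"; the selection procedure MLSTest actually uses may differ.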

Keywords: Concatenation; Incongruence; MLST; MLSTest; Multi-locus sequence typing; Software.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Aspergillus fumigatus / genetics
  • Candida glabrata / genetics
  • Diploidy*
  • Eukaryota / genetics*
  • Genomics / methods*
  • Haemophilus influenzae / genetics
  • Multilocus Sequence Typing / methods*
  • Neisseria meningitidis / genetics
  • Software
  • Trypanosoma cruzi / genetics