SeSAM: software for automatic construction of order-robust linkage maps

BMC Bioinformatics. 2022 Nov 19;23(1):499. doi: 10.1186/s12859-022-05045-7.

Abstract

Background: Genotyping and sequencing technologies produce increasingly large numbers of genetic markers, with potentially high rates of missing or erroneous data. As a result, linkage map construction is becoming increasingly complex. Moreover, the size of segregating populations remains constrained by cost and is increasingly out of proportion to the number of SNPs available. Guaranteeing a statistically robust marker order therefore requires that maps include only a carefully selected subset of SNPs.

Results: In this context, the SeSAM software allows automatic genetic map construction using seriation and placement approaches, to produce (1) a high-robustness framework map, which includes as many markers as possible while keeping the order robustness above a given statistical threshold, and (2) a high-density total map, which includes the framework plus almost all polymorphic markers. During this process, care is taken to limit the impact of genotyping errors and missing data on mapping quality. SeSAM can be used with a wide range of biparental populations, including those from outcrossing species, for which phases are inferred on the fly by maximum likelihood during map elongation. The package also includes functions to simulate data sets, convert data formats, detect putative genotyping errors, visualize data and map quality (including graphical genotypes), and merge several maps into a consensus. SeSAM is also suitable for interactive map construction, as it provides lower-level functions for 2-point and multipoint EM analyses. The software is implemented as an R package that includes functions written in C++.
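
As an illustration of the kind of 2-point analysis mentioned above, the following minimal Python sketch estimates a recombination fraction and its LOD score for two markers. It is not SeSAM code and does not use the SeSAM API: the function name `two_point`, the 0/1 genotype coding, the use of `None` for missing data, and the restriction to a backcross with known phase are all assumptions made for this example.

```python
import math


def two_point(marker_a, marker_b):
    """Two-point estimate for a backcross with known phase (illustrative only).

    Genotypes are coded 0 or 1; None marks a missing call. Individuals with a
    missing call at either marker are skipped. Returns (r, lod), where r is the
    maximum-likelihood recombination fraction capped at 0.5 and lod compares
    the likelihood at r against the no-linkage value 0.5.
    """
    pairs = [(a, b) for a, b in zip(marker_a, marker_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no individual is genotyped at both markers")

    recombinants = sum(1 for a, b in pairs if a != b)
    r = min(recombinants / n, 0.5)

    def log10_likelihood(theta):
        # Binomial log-likelihood; zero-count terms are dropped so that
        # log10(0) is never evaluated.
        value = 0.0
        if recombinants:
            value += recombinants * math.log10(theta)
        if n - recombinants:
            value += (n - recombinants) * math.log10(1.0 - theta)
        return value

    lod = log10_likelihood(r) - n * math.log10(0.5)
    return r, lod


if __name__ == "__main__":
    a = [0, 0, 1, 1, 0, 1, 0, 1, None, 1]
    b = [0, 0, 1, 1, 0, 1, 1, 1, 0, 1]
    r, lod = two_point(a, b)
    print(f"r = {r:.3f}, LOD = {lod:.2f}")
```

Skipping individuals with a missing call is the simplest way to handle missing data in a pairwise estimate; multipoint EM analyses, as referred to in the abstract, instead use information from neighbouring markers.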

Conclusions: SeSAM is fully automatic linkage mapping software designed to (1) produce a framework map as robust as desired by optimizing the selection of a subset of markers, and (2) produce a high-density map including almost all polymorphic markers. The software can be used with a wide range of biparental mapping populations, including those from outcrossing species. SeSAM is freely available under the GNU GPL v3 license and runs on Linux, Windows, and macOS. It can be downloaded together with its user manual and quick-start tutorial from ForgeMIA (SeSAM project) at https://forgemia.inra.fr/gqe-acep/sesam/-/releases.

Keywords: Automated software; Genetic mapping; Linkage; Marker order robustness; Seriation.

MeSH terms

  • Chromosome Mapping
  • Genetic Markers
  • Genotype
  • Polymorphism, Single Nucleotide*
  • Software*

Substances

  • Genetic Markers