SEDA: A Desktop Tool Suite for FASTA Files Processing

IEEE/ACM Trans Comput Biol Bioinform. 2022 May-Jun;19(3):1850-1860. doi: 10.1109/TCBB.2020.3040383. Epub 2022 Jun 3.

Abstract

SEDA (SEquence DAtaset builder) is a multiplatform desktop application for manipulating FASTA files containing DNA or protein sequences. Its graphical user interface gives access to a collection of simple (filtering, sorting, and file reformatting, among others) and advanced (BLAST searching, protein domain annotation, gene annotation, and sequence alignment) utilities not present in similar applications, which eases the work of life science researchers working with DNA and/or protein sequences, especially those with no programming skills. This paper presents general guidelines on how to build efficient data handling protocols using SEDA, as well as practical examples of how to prepare high-quality datasets for single-gene phylogenetic studies, the characterization of protein families, or phylogenomic studies. The user-friendliness of SEDA also relies on two important features: (i) the availability of easy-to-install distributable versions and installers of SEDA, including a Docker image for Linux, and (ii) the ease with which users can manage large datasets. SEDA is open source, released under the GNU General Public License v3.0, and publicly available on GitHub (https://github.com/sing-group/seda). SEDA installers and documentation are available at https://www.sing-group.org/seda/.
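To make the "simple" operations named in the abstract (filtering, sorting, file reformatting) concrete, the following is a minimal Python sketch of what such steps amount to on FASTA data. It is not SEDA's implementation and does not use any SEDA API; SEDA is a graphical desktop application, and the function names here (parse_fasta, filter_by_min_length, sort_by_length, format_fasta), the length-based criteria, and the sample sequences are all hypothetical, chosen only for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

# A record is (header without the leading '>', full sequence).
Record = Tuple[str, str]


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def filter_by_min_length(records: Iterable[Record], min_length: int) -> List[Record]:
    """Keep only records whose sequence has at least min_length residues."""
    return [(h, s) for h, s in records if len(s) >= min_length]


def sort_by_length(records: Iterable[Record], descending: bool = True) -> List[Record]:
    """Sort records by sequence length (stable for equal lengths)."""
    return sorted(records, key=lambda r: len(r[1]), reverse=descending)


def format_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Reformat records as FASTA text, wrapping sequences at `width` columns."""
    lines: List[str] = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    # Made-up sample input, for illustration only.
    sample = ">a\nACGT\nAC\n>b\nAC\n>c\nACGTACGTAC\n"
    kept = filter_by_min_length(parse_fasta(sample), min_length=4)
    print(format_fasta(sort_by_length(kept), width=4), end="")
```

Chaining small, single-purpose steps in this way mirrors the idea of building a data handling protocol out of individual operations, although the actual operations and options SEDA offers are those described in its documentation.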

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Phylogeny
  • Proteins*
  • Sequence Alignment
  • Software*

Substances

  • Proteins