Sorting Permutations by Intergenic Operations

IEEE/ACM Trans Comput Biol Bioinform. 2021 Nov-Dec;18(6):2080-2093. doi: 10.1109/TCBB.2021.3077418. Epub 2021 Dec 8.

Abstract

Genome Rearrangements are events that affect large stretches of genomes during evolution. Many mathematical models have been used to estimate the evolutionary distance between two genomes based on genome rearrangements. However, most of them focused on the (order of the) genes of a genome, disregarding other important elements in it. Recently, researchers have shown that considering regions between each pair of genes, called intergenic regions, can enhance distance estimation in realistic data. Two of the most studied genome rearrangements are the reversal, which inverts a sequence of genes, and the transposition, which occurs when two adjacent gene sequences swap their positions inside the genome. In this work, we study the transposition distance between two genomes, but we also consider intergenic regions, a problem we name Sorting by Intergenic Transpositions. We show that this problem is NP-hard and propose two approximation algorithms, with factors 3.5 and 2.5, considering two distinct definitions for the problem. We also investigate the signed reversal and transposition distance between two genomes considering their intergenic regions. This second problem is called Sorting by Signed Intergenic Reversals and Intergenic Transpositions. We show that this problem is NP-hard and develop two approximation algorithms, with factors 3 and 2.5. We check how these algorithms behave when assigning weights for genome rearrangements. Finally, we implemented all these algorithms and tested them on real and simulated data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • DNA Transposable Elements / genetics
  • DNA, Intergenic / genetics
  • Gene Rearrangement / genetics*
  • Genome / genetics*
  • Genomics / methods*
  • Sequence Analysis, DNA

Substances

  • DNA Transposable Elements
  • DNA, Intergenic