On the consistency of orthology relationships

Mark Jones; Christophe Paul; Céline Scornavacca

doi:10.1186/s12859-016-1267-3

BMC Bioinformatics. 2016 Nov 11;17(Suppl 14):416. doi: 10.1186/s12859-016-1267-3.

Authors

Mark Jones¹, Christophe Paul², Céline Scornavacca³

Affiliations

¹ LIRMM, CNRS, Université de Montpellier, Montpellier, France.
² ISE-M, CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France.
³ ISE-M, CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France. celine.scornavacca@umontpellier.fr.

Abstract

Background: Ortholog inference is the starting point of most comparative genomics studies, and a plethora of methods have been designed in the last decade to address this challenging task. In this paper we focus on the problem of deciding whether a partial set of orthology/paralogy relationships [Formula: see text] on a collection of n genes is consistent with a species tree, whether that tree is known or not.

Results: We give the first polynomial-time algorithm (more precisely, an O(n³) time algorithm) to decide whether [Formula: see text] is consistent, even when the species tree is unknown. We also investigate a biologically meaningful optimization version of these problems, in which we wish to minimize the number of duplication events. Unfortunately, we show that all of these optimization problems are NP-hard and are unlikely to have good polynomial-time approximation algorithms.

Conclusions: Our polynomial-time algorithm for checking consistency has been implemented in Python and is available at https://github.com/UdeM-LBIT/OrthoPara-ConstraintChecker.
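
Illustration: the orthology/paralogy relationships mentioned in the Background can be made concrete with a small sketch. It assumes the standard convention that two genes are orthologs when their lowest common ancestor in the gene tree is a speciation node, and paralogs when it is a duplication node. The toy tree, the gene names, and the helper functions `leaves`, `relations` and `satisfies` are hypothetical; this is not the paper's algorithm or the code in the repository above. It only checks a partial set of constraints against one given labeled gene tree, whereas the paper decides whether any suitable tree exists, and it does not address consistency with a species tree.

```python
from itertools import combinations

# Hypothetical toy gene tree. An internal node is (label, children) with
# label "S" (speciation) or "D" (duplication); a leaf is a gene name.
TREE = ("S", [("D", ["a1", "a2"]), "b1"])


def leaves(node):
    """Return the gene names below a node."""
    if isinstance(node, str):
        return [node]
    _, children = node
    return [gene for child in children for gene in leaves(child)]


def relations(node, out=None):
    """Map each unordered gene pair to 'orthologs' or 'paralogs',
    according to the label of the pair's lowest common ancestor."""
    if out is None:
        out = {}
    if isinstance(node, str):
        return out
    label, children = node
    kind = "orthologs" if label == "S" else "paralogs"
    # Genes in different child subtrees have this node as their LCA.
    for left, right in combinations(children, 2):
        for x in leaves(left):
            for y in leaves(right):
                out[frozenset((x, y))] = kind
    for child in children:
        relations(child, out)
    return out


def satisfies(tree, constraints):
    """Check a partial set of constraints {(gene, gene): kind} against one tree."""
    rel = relations(tree)
    return all(rel.get(frozenset(pair)) == kind
               for pair, kind in constraints.items())
```

For example, `satisfies(TREE, {("a1", "b1"): "orthologs"})` holds for the toy tree, while requiring `a1` and `a2` to be orthologs does not, because their lowest common ancestor is labeled as a duplication.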

Keywords: Inapproximability; Orthology detection; Para-NP hardness; Polynomial-time algorithms.

MeSH terms

Algorithms*
Genomics*
Internet
User-Computer Interface