A heuristic algorithm solving the mutual-exclusivity-sorting problem

Bioinformatics. 2023 Jan 1;39(1):btad016. doi: 10.1093/bioinformatics/btad016.

Abstract

Motivation: Binary (or Boolean) matrices provide a common effective data representation adopted in several domains of computational biology, especially for investigating cancer and other human diseases. For instance, they are used to summarize genetic aberrations-copy number alterations or mutations-observed in cancer patient cohorts, effectively highlighting combinatorial relations among them. One of these is the tendency for two or more genes not to be co-mutated in the same sample or patient, i.e. a mutual-exclusivity trend. Exploiting this principle has allowed identifying new cancer driver protein-interaction networks and has been proposed to design effective combinatorial anti-cancer therapies rationally. Several tools exist to identify and statistically assess mutual-exclusive cancer-driver genomic events. However, these tools need to be equipped with robust/efficient methods to sort rows and columns of a binary matrix to visually highlight possible mutual-exclusivity trends.

Results: Here, we formalize the mutual-exclusivity-sorting problem and present MutExMatSorting: an R package implementing a computationally efficient algorithm able to sort rows and columns of a binary matrix to highlight mutual-exclusivity patterns. Particularly, our algorithm minimizes the extent of collective vertical overlap between consecutive non-zero entries across rows while maximizing the number of adjacent non-zero entries in the same row. Here, we demonstrate that existing tools for mutual-exclusivity analysis are suboptimal according to these criteria and are outperformed by MutExMatSorting.

Availability and implementation: https://github.com/AleVin1995/MutExMatSorting.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Genomics
  • Heuristics*
  • Humans
  • Mutation
  • Neoplasms* / genetics