Scalable biclustering - the future of big data exploration?

Gigascience. 2019 Jul 1;8(7):giz078. doi: 10.1093/gigascience/giz078.

Abstract

Biclustering is a technique for discovering local similarities within data. For many years, the complexity of the methods and difficulties with parallelization limited its application to big data problems. With the development of novel scalable methods, biclustering has finally started to close this gap. In this paper, we discuss the caveats of biclustering and present its current challenges and guidelines for practitioners. We also explain why biclustering may soon become one of the standards for big data analytics.

Keywords: biclustering; big data; biomarker detection; co-clustering; data mining; disease subtype identification; gene-drug interaction; parallel algorithms; precision medicine.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Big Data*
  • Cluster Analysis
  • Data Mining / methods
  • Genome, Human
  • Genomics / methods*
  • Genomics / standards
  • Humans
  • Sequence Alignment / methods
  • Sequence Alignment / standards
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, DNA / standards
  • Software