Robust group fused lasso for multisample copy number variation detection under uncertainty

IET Syst Biol. 2016 Dec;10(6):229-236. doi: 10.1049/iet-syb.2015.0081.

Abstract

One of the most important needs in the post-genome era is to provide researchers with reliable and efficient computational tools for extracting and analysing the huge amount of available biological data, of which DNA copy number variation (CNV) is a vitally important type. Array-based comparative genomic hybridisation (aCGH) is a common approach for detecting CNVs. Most methods for this purpose were proposed for one-dimensional profiles; however, the focus has gradually moved from one- to multi-dimensional signals. In addition, since contamination of these profiles with noise is always an issue, it is highly important to have a robust method for analysing multi-sample aCGH profiles. In this study, the authors propose the robust group fused lasso, which utilises robust group total variation. Instead of the l2,1 norm, the l1-l2 M-estimator is used, which is more robust in dealing with non-Gaussian noise and high corruption. More importantly, correntropy (the Welsch M-estimator) is also applied to the fitting error. Extensive experiments indicate that the proposed method outperforms state-of-the-art algorithms and techniques under a wide range of scenarios with diverse types of noise.
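
As a rough illustration of why the two M-estimators named in the abstract tolerate outliers better than a squared error, the sketch below evaluates their commonly cited textbook forms. The abstract does not state the exact formulation used in the paper, so the function names, the scale parameter c and its default value are assumptions made for this example only, not the authors' implementation.

```python
import math


def squared_loss(r):
    """Ordinary least-squares loss, shown only as a non-robust baseline."""
    return 0.5 * r * r


def l1_l2_loss(r):
    """Textbook l1-l2 M-estimator: rho(r) = 2 * (sqrt(1 + r^2 / 2) - 1).

    Behaves like r^2 / 2 near zero and grows roughly linearly for large |r|.
    """
    return 2.0 * (math.sqrt(1.0 + r * r / 2.0) - 1.0)


def welsch_loss(r, c=1.0):
    """Textbook Welsch M-estimator: rho(r) = (c^2 / 2) * (1 - exp(-(r / c)^2)).

    The scale c is an assumed tuning parameter; the loss is bounded by c^2 / 2.
    """
    return (c * c / 2.0) * (1.0 - math.exp(-((r / c) ** 2)))


if __name__ == "__main__":
    # Compare how each loss penalises small and very large residuals.
    for r in (0.1, 1.0, 10.0, 100.0):
        print(r, squared_loss(r), l1_l2_loss(r), welsch_loss(r))
```

For small residuals all three losses are nearly identical, while for large residuals the l1-l2 loss grows only linearly and the Welsch loss saturates, so a single heavily corrupted probe contributes far less to the objective than it would under a squared error.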

MeSH terms

  • Algorithms
  • Breast Neoplasms / metabolism
  • Comparative Genomic Hybridization*
  • Computational Biology
  • DNA Copy Number Variations*
  • Databases, Genetic
  • Female
  • Genome, Human
  • Humans
  • Models, Statistical
  • Normal Distribution
  • Polymorphism, Single Nucleotide
  • ROC Curve
  • Signal-To-Noise Ratio
  • Software
  • Uncertainty