WaveDec: A Wavelet Approach to Identify Both Shared and Individual Patterns of Copy-Number Variations

IEEE Trans Biomed Eng. 2018 Feb;65(2):353-364. doi: 10.1109/TBME.2017.2769677.

Abstract

Copy-number variations (CNVs) are associated with complex diseases and particular tumor types. Array-based comparative genomic hybridization (aCGH) is a common approach for detecting CNVs. Traditional CNV detection methods for multiple aCGH samples mainly pool samples in batches to find common variations and do not account for the individual characteristics of each sample. Accurately differentiating the commonly shared CNV patterns from the individual ones is pivotal for identifying cell populations and for distinguishing cell growth (as in cancer) from the invasion of new cells. Our preliminary results demonstrate that the shared and individual CNV patterns each have distinctive characteristics after wavelet transform.

Methods: To exploit these characteristics, we formulate a quadratic data-separation problem within the wavelet space to discriminate the shared and individual CNVs from raw data. We derive a numerical solution and show that it can be obtained by solving decoupled subproblems. This decoupling limits computational cost, enabling efficient application to the analysis of large sequencing datasets.
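
The abstract does not state the objective function, so the following Python sketch is only an illustration of how a quadratic separation in wavelet space can decouple into independent subproblems. It is not the WaveDec formulation. It assumes an orthonormal Haar transform, signals whose length is a power of two, and simple ridge penalties (the hypothetical parameters `lam_shared` and `lam_indiv`) on the shared and individual components. Under those assumptions, each wavelet coefficient can be solved on its own in closed form.

```python
import math

def haar_forward(x):
    """Orthonormal Haar transform; len(x) must be a power of two."""
    out = list(x)
    n = len(out)
    while n > 1:
        half = n // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:n] = approx + detail  # approximations first, details after
        n = half
    return out

def haar_inverse(c):
    """Inverse of haar_forward."""
    out = list(c)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) / math.sqrt(2), (a - d) / math.sqrt(2)]
        out[:2 * n] = merged
        n *= 2
    return out

def separate(samples, lam_shared=0.1, lam_indiv=1.0):
    """Split samples into one shared profile and per-sample individual profiles.

    Assumed (illustrative) objective, per wavelet coefficient j:
        sum_i (w_ij - s_j - d_ij)^2 + lam_shared * s_j^2 + lam_indiv * sum_i d_ij^2
    Because the transform is orthonormal and the penalties are separable,
    every coefficient j is an independent small quadratic problem.
    """
    n = len(samples)
    coeffs = [haar_forward(s) for s in samples]
    c = lam_indiv / (1.0 + lam_indiv)  # weight left after eliminating d_ij
    shared_w = []
    indiv_w = [[] for _ in range(n)]
    for j in range(len(coeffs[0])):
        col = [w[j] for w in coeffs]
        s = c * sum(col) / (n * c + lam_shared)  # closed-form shared coefficient
        shared_w.append(s)
        for i in range(n):
            indiv_w[i].append((col[i] - s) / (1.0 + lam_indiv))  # closed-form individual
    return haar_inverse(shared_w), [haar_inverse(w) for w in indiv_w]

if __name__ == "__main__":
    # Toy log-ratio profiles: a gain shared by all samples, plus a loss in sample 2 only.
    common = [0, 0, 1, 1, 0, 0, 0, 0]
    profiles = [common[:], common[:], [0, 0, 1, 1, 0, 0, -1, -1]]
    shared, individual = separate(profiles)
    print([round(v, 3) for v in shared])
    print([[round(v, 3) for v in d] for d in individual])
```

The per-coefficient loop is what keeps the cost low in this sketch: each subproblem depends only on the number of samples, so the total work grows linearly with the number of probes. The penalties actually used to distinguish shared from individual patterns in WaveDec are not given in the abstract and may differ substantially.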

Results: The advantages of our proposed method, called WaveDec, have been demonstrated by comparison with popular CNV-detection methods using synthetic and empirical aCGH data. The performance of WaveDec was further validated by experiments with single-cell-sequencing data.

Conclusion: WaveDec can successfully differentiate shared and individual patterns, and performs well even on data contaminated with high levels of noise.

Significance: Both the shared and individual patterns can be uniquely characterized as well as effectively decomposed within the wavelet space.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Comparative Genomic Hybridization / methods*
  • Computational Biology / methods*
  • DNA Copy Number Variations / genetics*
  • Databases, Genetic
  • Humans
  • ROC Curve
  • Sequence Analysis, DNA / methods
  • Wavelet Analysis*