Read count approach for DNA copy number variants detection

Alberto Magi; Lorenzo Tattini; Tommaso Pippucci; Francesca Torricelli; Matteo Benelli

doi:10.1093/bioinformatics/btr707

Bioinformatics. 2012 Feb 15;28(4):470-8. doi: 10.1093/bioinformatics/btr707. Epub 2011 Dec 23.

Authors

Alberto Magi¹, Lorenzo Tattini, Tommaso Pippucci, Francesca Torricelli, Matteo Benelli

Affiliation

¹ Faculty of Medicine, University of Florence, Florence 50019, Italy. albertomagi@gmail.com

PMID: 22199393
DOI: 10.1093/bioinformatics/btr707

Abstract

Motivation: The advent of high-throughput sequencing technologies is revolutionizing our ability to discover and genotype DNA copy number variants (CNVs). Read count-based approaches are able to detect CNV regions with unprecedented resolution. Although this computational strategy was introduced in the literature only recently, much work has already been done on the preparation, normalization and analysis of this kind of data.

Results: Here we address the many aspects of CNV detection using the read count approach. We first study the characteristics and systematic biases of read count distributions, focusing on the normalization methods designed to remove these biases. Subsequently, we compare the algorithms designed to detect the boundaries of CNVs and investigate the ability of read count data to predict the exact DNA copy number. Finally, we review the publicly available tools for analysing read count data. To better understand the state of the art of read count approaches, we compare the performance of the three most widely used sequencing technologies (Illumina Genome Analyzer, Roche 454 and Life Technologies SOLiD) in all the analyses that we perform.
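
As a rough illustration of the normalize-then-estimate workflow outlined above, the following Python sketch may help. It is not the authors' method: the abstract does not name a specific bias or formula, so the choice of GC content as the bias, the median-scaling correction, the diploid baseline and the function names `gc_normalize` and `estimate_copy_number` are all assumptions made for this example. Boundary detection (segmentation) is omitted.

```python
from collections import defaultdict
from statistics import median


def gc_normalize(counts, gc_percent):
    """Rescale per-window read counts so every GC stratum shares one median.

    Assumption: GC content is the systematic bias being corrected, and
    windows with the same GC percentage form one stratum.
    """
    overall = median(counts)
    by_gc = defaultdict(list)
    for count, gc in zip(counts, gc_percent):
        by_gc[gc].append(count)
    gc_median = {gc: median(values) for gc, values in by_gc.items()}
    return [
        count * overall / gc_median[gc] if gc_median[gc] > 0 else 0.0
        for count, gc in zip(counts, gc_percent)
    ]


def estimate_copy_number(normalized, ploidy=2):
    """Round ploidy * (window count / median count) to an integer copy number.

    Assumption: most windows are copy-neutral, so the median window
    represents the baseline ploidy (2 for a diploid genome).
    """
    baseline = median(normalized)
    return [round(ploidy * value / baseline) for value in normalized]


# Toy data: windows alternate between 40% and 60% GC; the 60% windows
# collect twice as many reads purely because of the simulated bias.
counts = [100, 200, 100, 200, 50, 200, 100, 300, 100, 200]
gc_percent = [40, 60, 40, 60, 40, 60, 40, 60, 40, 60]

normalized = gc_normalize(counts, gc_percent)
copy_numbers = estimate_copy_number(normalized)
```

On this toy input the fifth window comes out as a one-copy loss and the eighth as a three-copy gain, while the twofold difference between GC strata is removed. Real data would additionally need mappability handling and a segmentation step before copy numbers are assigned to regions.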

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
DNA Copy Number Variations*
Genome, Human
High-Throughput Nucleotide Sequencing
Humans
Sequence Analysis, DNA / methods*