CNest: A novel copy number association discovery method uncovers 862 new associations from 200,629 whole-exome sequence datasets in the UK Biobank

Cell Genom. 2022 Aug 10;2(8):100167. doi: 10.1016/j.xgen.2022.100167.

Abstract

Copy number variation (CNV) is known to influence human traits and has a rich history of research in common and rare genetic disease. Although CNV is accepted as an important class of genomic variation, progress on copy-number-based genome-wide association studies (GWASs) from next-generation sequencing (NGS) data has been limited. Here we present a novel method for large-scale copy number analysis from NGS data that generates robust copy number estimates and allows copy number GWASs (CN-GWASs) to be performed genome-wide in discovery mode. We provide a detailed analysis in the UK Biobank resource and a specifically designed software package. We use these methods to perform CN-GWAS analysis across 78 human traits, discovering over 800 genetic associations that are likely to contribute strongly to trait distributions. Finally, we compare CNV and single-nucleotide polymorphism (SNP) association signals across the same traits and samples, defining specific CNV association classes.

Keywords: copy number variation; genome-wide association studies; next-generation sequencing; whole-exome sequencing.