A crowdsourcing database for the copy-number variation of the Spanish population

Daniel López-López; Gema Roldán; Jose L Fernández-Rueda; Gerrit Bostelmann; Rosario Carmona; Virginia Aquino; Javier Perez-Florido; Francisco Ortuño; Guillermo Pita; Rocío Núñez-Torres; Anna González-Neira; CSVS Crowdsourcing Group; María Peña-Chilet; Joaquin Dopazo

doi:10.1186/s40246-023-00466-8

Hum Genomics. 2023 Mar 9;17(1):20. doi: 10.1186/s40246-023-00466-8.

Authors

Daniel López-López¹ ² ³, Gema Roldán¹, Jose L Fernández-Rueda¹, Gerrit Bostelmann¹, Rosario Carmona¹ ³, Virginia Aquino¹, Javier Perez-Florido¹ ², Francisco Ortuño¹ ⁴, Guillermo Pita⁵, Rocío Núñez-Torres⁵, Anna González-Neira⁵; CSVS Crowdsourcing Group; María Peña-Chilet¹ ² ³, Joaquin Dopazo⁶ ⁷ ⁸ ⁹

Collaborators

CSVS Crowdsourcing Group:
Angel Alonso, Josefa Salgado-Garrido, Sara Pasalodos-Sanchez, Carmen Ayuso, Pablo Minguez, Almudena Avila-Fernandez, Marta Corton, Rafael Artuch, Salud Borrego, Guillermo Antiñolo, Angel Carracedo, Jorge Amigo, Luis Antonio Castaño, Isabel Tejada, Aitor Delmiro, Carmina Espinos, Daniel Grinberg, Encarnación Guillén, Pablo Lapunzina, Jose Antonio Lopez-Escámez, Alvaro Gallego-Martinez, Ramón Martí, Eulalia Rovira, José Mª Millán, Miguel Angel Moreno, Matías Morin, Antonio Moreno-Galdó, Mónica Fernández-Cancio, Beatriz Morte, Victoriano Mulero, Diana García, Virginia Nunes, Francesc Palau, Belén Perez, Luis Pérez Jurado, Rosario Perona, Aurora Pujol, Feliciano Ramos, Esther Lopez, Antonia Ribes, Jordi Rosell, Jordi Surrallés

Affiliations

¹ Computational Medicine Platform, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain.
² Institute of Biomedicine of Seville, IBiS, University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain.
³ Centro de Investigación Biomédica en Red en Enfermedades Raras (CIBERER), ISCIII, Madrid, Spain.
⁴ Department of Computer Architecture and Computer Technology, University of Granada, 18071, Granada, Spain.
⁵ Human Genotyping Unit-CeGen, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain.
⁶ Computational Medicine Platform, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain. joaquin.dopazo@juntadeandalucia.es.
⁷ Institute of Biomedicine of Seville, IBiS, University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain. joaquin.dopazo@juntadeandalucia.es.
⁸ Centro de Investigación Biomédica en Red en Enfermedades Raras (CIBERER), ISCIII, Madrid, Spain. joaquin.dopazo@juntadeandalucia.es.
⁹ FPS/ELIXIR-ES, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain. joaquin.dopazo@juntadeandalucia.es.

Abstract

Background: Although copy-number variations (CNVs) are a very common type of genetic variation, their distribution in the population is still poorly understood. Knowledge of genetic variability, especially at the level of the local population, is critical for distinguishing pathogenic from non-pathogenic variation in the discovery of new disease variants.

Results: Here, we present the SPAnish Copy Number Alterations Collaborative Server (SPACNACS), which currently contains copy number variation profiles obtained from more than 400 genomes and exomes of unrelated Spanish individuals. Through a collaborative crowdsourcing effort, whole genome and whole exome sequencing data, produced by local genomic projects and for other purposes, are continuously collected. Once both the Spanish ancestry of each individual and the lack of kinship with other individuals in SPACNACS have been checked, CNVs are inferred from these sequences and used to populate the database. A web interface allows querying the database with different filters, including ICD10 upper categories. This makes it possible to discard samples from the disease under study and obtain pseudo-control CNV profiles from the local population. We also present additional studies on the local impact of CNVs on some phenotypes and on pharmacogenomic variants. SPACNACS can be accessed at: http://csvs.clinbioinfosspa.es/spacnacs/.
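
The pseudo-control idea described above can be illustrated with a minimal sketch. It is not the SPACNACS implementation or its API: the `CNV` and `Sample` classes, the function names, the ICD10 chapter labels, and the toy data are all hypothetical and serve only to show the logic of excluding one disease category and then counting carriers of a CNV that overlaps a region of interest.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CNV:
    chrom: str
    start: int  # inclusive coordinate (assumed convention)
    end: int    # inclusive coordinate (assumed convention)
    kind: str   # "DEL" or "DUP" (hypothetical labels)


@dataclass
class Sample:
    sample_id: str
    icd10_chapter: str  # hypothetical label for the ICD10 upper category
    cnvs: list = field(default_factory=list)


def pseudo_controls(samples, excluded_chapter):
    """Keep only samples that do not belong to the disease category under study."""
    return [s for s in samples if s.icd10_chapter != excluded_chapter]


def overlaps(cnv, chrom, start, end):
    """True if the CNV shares at least one position with the query interval."""
    return cnv.chrom == chrom and cnv.start <= end and start <= cnv.end


def carrier_frequency(samples, chrom, start, end, kind):
    """Fraction of samples carrying at least one CNV of `kind` in the interval."""
    if not samples:
        return 0.0
    carriers = sum(
        any(c.kind == kind and overlaps(c, chrom, start, end) for c in s.cnvs)
        for s in samples
    )
    return carriers / len(samples)


# Toy data, invented for illustration only.
samples = [
    Sample("S1", "VI", [CNV("1", 1000, 5000, "DEL")]),
    Sample("S2", "IX", [CNV("1", 4000, 9000, "DEL")]),
    Sample("S3", "IX", []),
    Sample("S4", "II", [CNV("1", 100, 900, "DUP")]),
]

# Studying a chapter "VI" disease: drop those samples, then query a region.
controls = pseudo_controls(samples, excluded_chapter="VI")
freq = carrier_frequency(controls, "1", 4500, 6000, "DEL")
print(f"{len(controls)} pseudo-controls, deletion carrier frequency {freq:.3f}")
```

In this toy setting, a deletion seen in a patient would be compared against the carrier frequency among the remaining samples, which stand in for local population controls.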

Conclusion: SPACNACS facilitates disease gene discovery by providing detailed information on the local variability of the population and exemplifies how genomic data produced for other purposes can be reused to build a local reference database.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Crowdsourcing*
DNA Copy Number Variations* / genetics
Databases, Factual
Genomics
Phenotype