CNSA: a data repository for archiving omics data

Xueqin Guo; Fengzhen Chen; Fei Gao; Ling Li; Ke Liu; Lijin You; Cong Hua; Fan Yang; Wanliang Liu; Chunhua Peng; Lina Wang; Xiaoxia Yang; Feiyu Zhou; Jiawei Tong; Jia Cai; Zhiyong Li; Bo Wan; Lei Zhang; Tao Yang; Minwen Zhang; Linlin Yang; Yawen Yang; Wenjun Zeng; Bo Wang; Xiaofeng Wei; Xun Xu

doi:10.1093/database/baaa055

Database (Oxford). 2020 Jan 1:2020:baaa055. doi: 10.1093/database/baaa055.

Authors

Xueqin Guo¹, Fengzhen Chen¹, Fei Gao¹, Ling Li¹, Ke Liu¹, Lijin You¹, Cong Hua¹, Fan Yang¹, Wanliang Liu¹, Chunhua Peng¹, Lina Wang¹, Xiaoxia Yang¹, Feiyu Zhou¹, Jiawei Tong¹, Jia Cai¹, Zhiyong Li¹, Bo Wan¹, Lei Zhang¹, Tao Yang¹, Minwen Zhang¹, Linlin Yang¹, Yawen Yang¹, Wenjun Zeng¹, Bo Wang¹, Xiaofeng Wei¹, Xun Xu¹ ² ³

Affiliations

¹ China National GeneBank, Shenzhen 518120, China.
² BGI-Shenzhen, Shenzhen 518083, China.
³ Guangdong Provincial Key Laboratory of Genome Read and Write, Shenzhen 518120, China.

Abstract

As high-throughput sequencing technology is increasingly applied in the life and health sciences, the resulting massive multi-omics data raise the problem of efficient management and utilization. Database development and biocuration are prerequisites for reusing these big data. Here, relying on the China National GeneBank (CNGB), we present the CNGB Sequence Archive (CNSA) for archiving omics data, including raw sequencing data and the results of further analysis, which are currently organized into six objects: Project, Sample, Experiment, Run, Assembly and Variation. Moreover, for some projects CNSA has created a correlation model linking living samples, sample information and analytical data. Both living samples and analytical data are directly correlated with the sample information. Starting from any one of the three, the information or data of the other two can be obtained, so that all data can be traced throughout the life cycle from the living sample to the sample information to the analytical data. Complying with the data standards commonly used in the life sciences, CNSA is committed to building a comprehensive and curated data repository for storing, managing and sharing omics data. We will continue to improve the data standards and provide free access to open-data resources for worldwide scientific communities to support academic research and the bio-industry. Database URL: https://db.cngb.org/cnsa/.
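
The correlation model described in the abstract can be illustrated with a minimal Python sketch. This is not CNSA's actual schema or API: the class names `SampleInfo` and `TraceIndex`, their fields and all identifiers are hypothetical, chosen only to show how linking both living samples and analytical data to the sample information lets any one of the three be resolved to the other two.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SampleInfo:
    """Sample information: the hub that the other two record types link to."""
    sample_id: str
    # Hypothetical identifier of the physical (living) sample, if one exists.
    living_sample_id: Optional[str] = None
    # Hypothetical identifiers of analytical data, e.g. Run or Assembly records.
    analytical_data_ids: List[str] = field(default_factory=list)


class TraceIndex:
    """Resolves any one of the three record types to the sample hub."""

    def __init__(self) -> None:
        self._samples: Dict[str, SampleInfo] = {}
        self._by_living: Dict[str, str] = {}  # living sample ID -> sample ID
        self._by_data: Dict[str, str] = {}    # analytical data ID -> sample ID

    def add(self, info: SampleInfo) -> None:
        """Register a sample and index its links in both directions."""
        self._samples[info.sample_id] = info
        if info.living_sample_id is not None:
            self._by_living[info.living_sample_id] = info.sample_id
        for data_id in info.analytical_data_ids:
            self._by_data[data_id] = info.sample_id

    def from_sample(self, sample_id: str) -> SampleInfo:
        return self._samples[sample_id]

    def from_living_sample(self, living_sample_id: str) -> SampleInfo:
        return self._samples[self._by_living[living_sample_id]]

    def from_analytical_data(self, data_id: str) -> SampleInfo:
        return self._samples[self._by_data[data_id]]


# Usage with made-up identifiers: start from an analytical record and
# recover the sample information and the linked living sample.
index = TraceIndex()
index.add(SampleInfo("SAMPLE_1", living_sample_id="LIVING_1",
                     analytical_data_ids=["RUN_1", "ASSEMBLY_1"]))
traced = index.from_analytical_data("RUN_1")
print(traced.sample_id, traced.living_sample_id)
```

Keeping the sample information as the single hub means living samples and analytical data never need to reference each other directly, which mirrors the statement that both are "directly correlated with the sample information".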

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Big Data
Computational Biology
Data Curation*
Database Management Systems*
Databases, Genetic*