Secure top most significant genome variants search: iDASH 2017 competition

BMC Med Genomics. 2018 Oct 11;11(Suppl 4):82. doi: 10.1186/s12920-018-0399-x.

Abstract

Background: One of the three tracks of the iDASH Privacy & Security Workshop 2017 competition was to execute a whole-genome variant search on private genomic data. Specifically, the search application had to find the most significant SNPs (single-nucleotide polymorphisms) in a database of genome records labeled as control or case. In this paper, we discuss the solution our team submitted to this competition.

Methods: Privacy and confidentiality of the genome data had to be ensured using Intel SGX enclaves. The typical use case for this application is the multi-party computation (each party holding one or several genome records) of the SNPs that statistically differentiate the control and case genome datasets.

Results: Our solution consists of two applications: (i) one that compresses and encrypts genome files and (ii) one that performs the genome processing (the search for the most significant SNPs). We opted for a horizontal treatment of genome records and made heavy use of parallel processing. Both applications were developed in the Rust programming language.
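
The search step described above can be illustrated with a minimal sketch; it is not the submitted implementation. It assumes that per-SNP allele counts have already been aggregated for the case and control groups, and that significance is measured with a Pearson chi-squared statistic on the resulting 2x2 table; the abstract does not name the statistic, so this choice is an assumption. The names `SnpCounts`, `chi_squared` and `top_k` are hypothetical, and decryption, decompression and the SGX enclave boundary are left out entirely.

```rust
use std::thread;

/// Aggregated allele counts for one SNP (hypothetical layout).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct SnpCounts {
    pub id: u32,
    pub case_alt: u32,
    pub case_ref: u32,
    pub control_alt: u32,
    pub control_ref: u32,
}

/// Pearson chi-squared statistic for the 2x2 table
/// (case/control) x (alternate/reference allele).
/// Returns 0.0 when a row or column total is zero.
pub fn chi_squared(c: &SnpCounts) -> f64 {
    let a = c.case_alt as f64;
    let b = c.case_ref as f64;
    let cc = c.control_alt as f64;
    let d = c.control_ref as f64;
    let n = a + b + cc + d;
    let denom = (a + b) * (cc + d) * (a + cc) * (b + d);
    if denom == 0.0 {
        return 0.0;
    }
    n * (a * d - b * cc).powi(2) / denom
}

/// Scores all SNPs on `workers` scoped threads, then returns the `k`
/// highest-scoring (id, statistic) pairs in descending order.
pub fn top_k(snps: &[SnpCounts], k: usize, workers: usize) -> Vec<(u32, f64)> {
    let workers = workers.max(1);
    // Ceiling division; at least 1 because `chunks` panics on size 0.
    let chunk_len = ((snps.len() + workers - 1) / workers).max(1);

    let mut scored: Vec<(u32, f64)> = thread::scope(|s| {
        // Spawn every worker first, then join them all.
        let handles: Vec<_> = snps
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .map(|c| (c.id, chi_squared(c)))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect::<Vec<_>>()
    });

    // Stable sort by statistic, largest first.
    scored.sort_by(|x, y| y.1.total_cmp(&x.1));
    scored.truncate(k);
    scored
}
```

Calling `top_k(&snps, 10, 4)` would score the slice on four threads and keep the ten highest-scoring SNPs. Sorting the full score list keeps the sketch short; a bounded heap would avoid the full sort when the number of SNPs is much larger than `k`.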

Conclusions: The processing application scales well and achieves very good performance metrics. The contest organizers selected our solution as the best submission among the entries received, and our team was awarded first prize in this track.

Keywords: Genome variants search; Genomic data privacy; iDASH competition; Intel SGX.

MeSH terms

  • Algorithms
  • Computer Security*
  • Genome*
  • Humans
  • Polymorphism, Single Nucleotide / genetics*
  • Programming Languages
  • Software