Secure analysis of distributed chemical databases without data integration

J Comput Aided Mol Des. 2005 Sep-Oct;19(9-10):739-47. doi: 10.1007/s10822-005-9011-5. Epub 2005 Nov 3.

Abstract

We present a method for performing statistically valid linear regressions on the union of distributed chemical databases that preserves confidentiality of those databases. The method employs secure multi-party computation to share local sufficient statistics necessary to compute least squares estimators of regression coefficients, error variances and other quantities of interest. We illustrate our method with an example containing four companies' rather different databases.
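The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which secure multi-party protocol is used, so the sketch assumes a simple ring-based secure summation with semi-honest parties. Each party computes its local sufficient statistics (X'X, X'y, y'y and n), and only the masked running totals are passed around. All function names, the fixed-point scale, the modulus and the toy data are hypothetical choices made for illustration, and the whole protocol is simulated in a single process.

```python
import random

MODULUS = 2**127 - 1  # assumed: must exceed any scaled pooled total
SCALE = 10**6         # assumed fixed-point precision for encoding reals

def local_stats(X, y):
    """Flattened local sufficient statistics: X'X, X'y, y'y, n."""
    p = len(X[0])
    xtx = [sum(r[i] * r[j] for r in X) for i in range(p) for j in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    yty = sum(yi * yi for yi in y)
    return xtx + xty + [yty, float(len(y))]

def secure_sum(values, rng):
    """Simulated ring protocol: the first party adds a random mask, each
    party adds its own value modulo MODULUS, and the mask is removed last."""
    mask = rng.randrange(MODULUS)
    running = mask
    for v in values:  # each step would be one party in a real deployment
        running = (running + round(v * SCALE)) % MODULUS
    total = (running - mask) % MODULUS
    if total > MODULUS // 2:  # map back to a signed value
        total -= MODULUS
    return total / SCALE

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def pooled_regression(parties, seed=0):
    """Least squares on the union of the parties' data, using only
    securely summed sufficient statistics (never the raw rows)."""
    # random.Random is for reproducibility only; a real protocol would
    # need a cryptographically secure source such as the secrets module.
    rng = random.Random(seed)
    stats = [local_stats(X, y) for X, y in parties]
    pooled = [secure_sum([s[i] for s in stats], rng)
              for i in range(len(stats[0]))]
    p = len(parties[0][0][0])
    xtx = [pooled[i * p:(i + 1) * p] for i in range(p)]
    xty = pooled[p * p:p * p + p]
    yty, n = pooled[-2], pooled[-1]
    beta = solve(xtx, xty)
    sse = yty - sum(b * v for b, v in zip(beta, xty))  # y'y - beta'X'y
    return beta, sse / (n - p)  # coefficients, error variance estimate

# Toy data (made up): three parties, model y = 1 + 2x, intercept column first.
parties = [
    ([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [1.0, 3.0, 5.0]),
    ([[1.0, 3.0], [1.0, 4.0]], [7.0, 9.0]),
    ([[1.0, 5.0], [1.0, 6.0], [1.0, 7.0]], [11.0, 13.0, 15.0]),
]
beta, sigma2 = pooled_regression(parties)
print(beta, sigma2)
```

Because the least squares estimators depend on the data only through sums of per-party statistics, the pooled fit matches what a regression on the integrated database would give, up to the fixed-point rounding introduced by SCALE.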

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Computer Security
  • Databases, Factual*
  • Least-Squares Analysis
  • Linear Models
  • Models, Chemical*
  • Organic Chemicals / chemistry
  • Regression Analysis
  • Solubility
  • Water

Substances

  • Organic Chemicals
  • Water