Discovering and deciphering relationships across disparate data modalities

Joshua T Vogelstein; Eric W Bridgeford; Qing Wang; Carey E Priebe; Mauro Maggioni; Cencheng Shen

doi:10.7554/eLife.41690

eLife. 2019 Jan 15;8:e41690. doi: 10.7554/eLife.41690.

Authors

Joshua T Vogelstein¹ ², Eric W Bridgeford¹, Qing Wang¹, Carey E Priebe¹, Mauro Maggioni¹, Cencheng Shen³

Affiliations

¹ Johns Hopkins University, Baltimore, United States.
² Child Mind Institute, New York, United States.
³ University of Delaware, Delaware, United States.

Abstract

Understanding the relationships between different properties of data, such as whether a genome or connectome has information about disease status, is increasingly important. While existing approaches can test whether two properties are related, they may require unfeasibly large sample sizes and often are not interpretable. Our approach, 'Multiscale Graph Correlation' (MGC), is a dependence test that juxtaposes disparate data science techniques, including k-nearest neighbors, kernel methods, and multiscale analysis. Other methods may require double or triple the number of samples to achieve the same statistical power as MGC in a benchmark suite including high-dimensional and nonlinear relationships, with dimensionality ranging from 1 to 1000. Moreover, MGC uniquely characterizes the latent geometry underlying the relationship, while maintaining computational efficiency. In real data, including brain imaging and cancer genetics, MGC detects the presence of a dependency and provides guidance for the next experiments to conduct.
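
To make the notion of a dependence test concrete, the following is a minimal sketch in plain Python. It is not MGC and not the authors' implementation: it swaps in a simpler, single-scale technique (a distance-correlation permutation test) and omits the k-nearest-neighbor and multiscale steps entirely. The function names, the sample size, the quadratic relationship, and the noise level are all assumptions chosen for illustration.

```python
import math
import random


def pairwise_distances(values):
    """Return the n x n matrix of absolute differences between 1-D samples."""
    return [[abs(a - b) for b in values] for a in values]


def double_center(d):
    """Subtract row and column means from a distance matrix, add the grand mean."""
    n = len(d)
    row_means = [sum(row) / n for row in d]
    col_means = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_means) / n
    return [[d[i][j] - row_means[i] - col_means[j] + grand for j in range(n)]
            for i in range(n)]


def _mean_product(a, b, perm):
    """Average of a[i][j] * b[perm[i]][perm[j]] over all index pairs."""
    n = len(a)
    total = sum(a[i][j] * b[perm[i]][perm[j]]
                for i in range(n) for j in range(n))
    return total / (n * n)


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length 1-D samples."""
    a = double_center(pairwise_distances(x))
    b = double_center(pairwise_distances(y))
    identity = list(range(len(x)))
    cov = _mean_product(a, b, identity)
    var_a = _mean_product(a, a, identity)
    var_b = _mean_product(b, b, identity)
    if var_a * var_b == 0:
        return 0.0
    return math.sqrt(max(cov, 0.0) / math.sqrt(var_a * var_b))


def permutation_pvalue(x, y, reps=200, seed=0):
    """Permutation test using squared distance covariance as the statistic.

    Returns (observed statistic, p-value). Shuffling y breaks any pairing
    with x, so the shuffled statistics approximate the null distribution.
    """
    rng = random.Random(seed)
    a = double_center(pairwise_distances(x))
    b = double_center(pairwise_distances(y))
    identity = list(range(len(x)))
    observed = _mean_product(a, b, identity)
    exceed = 0
    for _ in range(reps):
        perm = identity[:]
        rng.shuffle(perm)
        if _mean_product(a, b, perm) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (reps + 1)


# Illustrative data (an assumption, not from the paper): a noisy quadratic,
# which is strongly dependent yet has little linear correlation.
data_rng = random.Random(1)
x = [data_rng.uniform(-1, 1) for _ in range(50)]
y = [v * v + data_rng.gauss(0, 0.05) for v in x]

statistic, pvalue = permutation_pvalue(x, y, reps=200, seed=2)
print("distance correlation:", round(distance_correlation(x, y), 3))
print("permutation p-value:", round(pvalue, 4))
```

A small p-value here indicates that the observed pairing of x and y is unlikely under independence. Unlike this sketch, MGC as described in the abstract also characterizes the geometry of the relationship, which a single global statistic cannot do.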

Keywords: computational biology; data science; human; machine learning; neuroscience; statistics; systems biology.

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms*
Biomarkers, Tumor / metabolism
Brain / diagnostic imaging
Brain / physiology
Computer Simulation
Data Analysis*
Humans
Neoplasms / metabolism
Sample Size

Substances

Biomarkers, Tumor

Grants and funding

Endeavor Scientist Program/Child Mind Institute/International