Core Statistical Methods for Chemogenomic Data

Christin Rakers

doi:10.1007/978-1-4939-8639-2_7

Methods Mol Biol. 2018:1825:227-277. doi: 10.1007/978-1-4939-8639-2_7.

Author

Christin Rakers ¹ ²

Affiliations

¹ Graduate School of Pharmaceutical Sciences, Yoshida-shimoadachicho, Kyoto University, Sakyo-ku, Kyoto, Japan. rakers@pharm.kyoto-u.ac.jp.
² Graduate School of Science, Nagoya University, Nagoya, Japan. rakers@pharm.kyoto-u.ac.jp.

PMID: 30334208
DOI: 10.1007/978-1-4939-8639-2_7

Abstract

Chemogenomic modeling involves the construction of algorithmic or statistical models for prediction on new input data and is often based on noisy, multidescriptor data. A deeper understanding of such data through statistical analyses can underpin informed study design and increase the information gained from prediction results and model performance. This chapter introduces basic statistical concepts and provides step-by-step instructions to explore and visualize chemogenomic data using the statistics-centered, open-source software R. Directions on executing essential techniques such as the calculation of correlations, hypothesis testing, and clustering are provided.

Keywords: Chemogenomic data; Clustering; Correlation; Feature importance; Hypothesis testing; Normality.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Data Visualization
Databases, Factual
Genomics / methods*
Humans
Models, Statistical*
Pharmaceutical Preparations / chemistry*
Software*

Substances

Pharmaceutical Preparations