Chemogenomic modeling involves the construction of algorithmic or statistical models for prediction on new input data and is often based on noisy, multidescriptor data. A deeper understanding of such data through statistical analysis can underpin informed study design and increase the information gained from prediction results and model performance. This chapter introduces basic statistical concepts and provides step-by-step instructions for exploring and visualizing chemogenomic data using R, an open-source software environment for statistical computing. Directions for executing essential techniques such as the calculation of correlations, hypothesis testing, and clustering are provided.
Keywords: Chemogenomic data; Clustering; Correlation; Feature importance; Hypothesis testing; Normality.