Accessing the High-Throughput Screening Data Landscape

Methods Mol Biol. 2016:1473:153-9. doi: 10.1007/978-1-4939-6346-1_16.

Abstract

Advances in high-throughput screening (HTS) techniques are changing the chemical data landscape by generating massive amounts of biological data for tested compounds. Public data repositories (e.g., PubChem) receive HTS data from various institutes, and this data pool is updated daily. The goal of these data-sharing efforts is to let users quickly obtain the biological data of target compounds. Because no universal chemical identifier exists, these repositories let users query and retrieve chemical properties and biological data by several different chemical identifiers (e.g., SMILES, InChIKey, and IUPAC name). The major challenge for most users, especially computational modelers, is obtaining biological data for a large dataset of compounds (e.g., thousands of drug molecules) rather than for a single compound. This chapter introduces the steps for accessing public data repositories for target compounds, with specific emphasis on automated data downloading for large datasets.
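
As an illustration of the kind of automated retrieval the abstract describes, the following is a minimal Python sketch that resolves compound identifiers to PubChem CIDs and downloads each compound's bioassay summary through PubChem's PUG REST interface. It is not the chapter's own procedure: the helper names (build_cid_url, build_assay_summary_url, fetch_cids, download_assay_summaries) and the 0.25 s delay between requests are assumptions made for this example, and SMILES strings containing characters such as "/" or "#" may need to be sent by POST rather than in the URL path.

```python
import json
import time
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen

# Base address of the PubChem PUG REST service.
PUG_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# Identifier types handled by this sketch (PUG REST accepts others too).
NAMESPACES = {"name", "smiles", "inchikey"}


def build_cid_url(namespace, identifier):
    """Build the URL that maps one identifier to PubChem CIDs (JSON output)."""
    if namespace not in NAMESPACES:
        raise ValueError(f"unsupported namespace: {namespace!r}")
    # Percent-encode the identifier so it is safe inside a URL path.
    encoded = quote(identifier, safe="")
    return f"{PUG_BASE}/compound/{namespace}/{encoded}/cids/JSON"


def build_assay_summary_url(cid):
    """Build the URL for the bioassay summary of one CID (CSV output)."""
    return f"{PUG_BASE}/compound/cid/{int(cid)}/assaysummary/CSV"


def fetch_cids(namespace, identifier, timeout=30):
    """Return the list of CIDs matching an identifier (network call)."""
    with urlopen(build_cid_url(namespace, identifier), timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["IdentifierList"]["CID"]


def download_assay_summaries(identifiers, namespace="name", delay=0.25):
    """Yield (identifier, cid, csv_text) for each identifier that resolves.

    The pause between compounds is an assumed courtesy interval meant to
    keep the request rate low; adjust it to the repository's usage policy.
    """
    for identifier in identifiers:
        try:
            cid = fetch_cids(namespace, identifier)[0]
            with urlopen(build_assay_summary_url(cid), timeout=60) as resp:
                csv_text = resp.read().decode("utf-8")
        except (HTTPError, KeyError, IndexError):
            # Skip identifiers that do not resolve or have no assay data.
            continue
        yield identifier, cid, csv_text
        time.sleep(delay)
```

For a large dataset, a caller could iterate over download_assay_summaries(["aspirin", "ibuprofen"]) and write each csv_text to its own file; changing the namespace argument to "smiles" or "inchikey" switches the identifier type used for the lookup.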

Keywords: Biological data; Chemical identifier; Compounds; PubChem.

MeSH terms

  • Algorithms
  • Aspirin / pharmacology
  • Cell Line
  • Cell Survival / drug effects
  • Databases, Chemical
  • Dose-Response Relationship, Drug
  • Gene Expression
  • High-Throughput Screening Assays / standards*
  • Humans
  • Information Dissemination*
  • Proteins / agonists
  • Proteins / antagonists & inhibitors
  • Proteins / genetics*
  • Proteins / metabolism
  • Software*
  • Xenobiotics / toxicity*

Substances

  • Proteins
  • Xenobiotics
  • Aspirin