Dataset on aquatic ecotoxicity predictions of 2697 chemicals, using three quantitative structure-activity relationship platforms

Data Brief. 2023 Oct 24:51:109719. doi: 10.1016/j.dib.2023.109719. eCollection 2023 Dec.

Abstract

Empirical and in silico data on the aquatic ecotoxicology of 2697 organic chemicals were collected to compile a dataset for assessing the predictive power of current Quantitative Structure-Activity Relationship (QSAR) models and software platforms. This article presents the dataset and the data pipeline used to create it. Empirical data were collected from the US EPA ECOTOX Knowledgebase (ECOTOX) and the European Food Safety Authority (EFSA) report "Completion of data entry of pesticide ecotoxicology Tier 1 study endpoints in a XML schema - database". Only data for OECD-recommended algae, daphnia and fish species were retained. QSAR toxicity predictions were calculated for each chemical and each of six endpoints using the ECOSAR, VEGA and Toxicity Estimation Software Tool (T.E.S.T.) platforms. Finally, the dataset was supplemented with SMILES, InChIKey, pKa and logP values obtained from webchem and PubChem.

Keywords: Chemical toxicity; ECOSAR; Quantitative structure-activity relationship; Toxicity estimation software tool; VEGA.
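
The two central pipeline steps named in the abstract (retaining only records for recommended test species, then pairing each empirical record with the predictions from the three platforms) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the species set, the field names (`inchikey`, `species`, `endpoint`, `value`), the endpoint label and all values are hypothetical placeholders chosen for illustration only.

```python
# Minimal sketch, NOT the published pipeline.
# Assumptions: the species set, record field names, the endpoint label and
# every value below are hypothetical placeholders.

# Example species commonly used in OECD test guidelines; the dataset's
# actual species list may differ.
ALLOWED_SPECIES = {
    "Raphidocelis subcapitata",  # algae
    "Daphnia magna",             # daphnia
    "Danio rerio",               # fish
}

# The three QSAR platforms named in the abstract.
PLATFORMS = ("ECOSAR", "VEGA", "TEST")


def filter_by_species(records, allowed=ALLOWED_SPECIES):
    """Keep only empirical records measured on an allowed test species."""
    return [r for r in records if r["species"] in allowed]


def attach_predictions(records, predictions):
    """Pair each empirical record with per-platform predictions.

    `predictions` maps platform name -> {(inchikey, endpoint): value}.
    A platform with no prediction for a record yields None.
    """
    merged = []
    for r in records:
        key = (r["inchikey"], r["endpoint"])
        row = dict(r)  # copy so the input record is left unchanged
        for platform in PLATFORMS:
            row[f"pred_{platform}"] = predictions.get(platform, {}).get(key)
        merged.append(row)
    return merged


if __name__ == "__main__":
    # Placeholder data, not taken from the dataset.
    empirical = [
        {"inchikey": "KEY-A", "species": "Daphnia magna",
         "endpoint": "daphnia_acute", "value": 1.2},
        {"inchikey": "KEY-A", "species": "Unlisted species",
         "endpoint": "daphnia_acute", "value": 3.4},
    ]
    predicted = {"ECOSAR": {("KEY-A", "daphnia_acute"): 0.9}}
    for row in attach_predictions(filter_by_species(empirical), predicted):
        print(row)
```

Keying predictions by chemical identifier and endpoint keeps empirical and predicted values side by side for each platform, which is the layout a comparison of predictive power would need.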