Towards reproducible computational drug discovery

J Cheminform. 2020 Jan 28;12(1):9. doi: 10.1186/s13321-020-0408-x.

Authors

Nalini Schaduangrat¹, Samuel Lampa², Saw Simeon³, Matthew Paul Gleeson⁴, Ola Spjuth⁵, Chanin Nantasenamat⁶

Affiliations

¹ Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, 10700, Bangkok, Thailand.
² Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden.
³ Interdisciplinary Graduate Program in Bioscience, Faculty of Science, Kasetsart University, 10900, Bangkok, Thailand.
⁴ Department of Biomedical Engineering, Faculty of Engineering, King Mongkut's Institute of Technology Ladkrabang, 10520, Bangkok, Thailand. paul.gl@kmitl.ac.th.
⁵ Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden. ola.spjuth@farmbio.uu.se.
⁶ Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, 10700, Bangkok, Thailand. chanin.nan@mahidol.edu.

Abstract

The reproducibility of experiments has been a long-standing impediment to further scientific progress. Computational methods have been instrumental in drug discovery efforts owing to their multifaceted utilization for data collection, pre-processing, analysis and inference. This article provides in-depth coverage of the reproducibility of computational drug discovery. This review explores the following topics: (1) the current state of the art in reproducible research, (2) research documentation (e.g. electronic laboratory notebooks, Jupyter notebooks, etc.), (3) the science of reproducible research (i.e. comparison and contrast with related concepts such as replicability, reusability and reliability), (4) model development in computational drug discovery, (5) computational issues in model development and deployment, and (6) use case scenarios for streamlining the computational drug discovery protocol. In computational disciplines, it has become common practice to share the data and code used for numerical calculations, not only to facilitate reproducibility but also to foster collaboration (i.e. to drive the project further by introducing new ideas, growing the data, augmenting the code, etc.). It is therefore inevitable that the field of computational drug design would adopt an open approach towards the collection, curation and sharing of data/code.

Keywords: Bioinformatics; Cheminformatics; Data science; Data sharing; Drug design; Drug discovery; Open data; Open science; Reproducibility; Reproducible research.

Publication types

Review

Grants and funding

RSA6280075/Thailand Research Fund