PyComp: A Versatile Tool for Efficient Data Extraction, Conversion, and Management in High-throughput Virtual Drug Screening

Curr Comput Aided Drug Des. 2024 Jan 8. doi: 10.2174/0115734099274495231218150611. Online ahead of print.

Abstract

Background: Virtual screening (VS) is essential for analyzing potential drug candidates in drug discovery. Often, this involves the conversion of large volumes of compound data into specific formats suitable for computational analysis. Managing and processing this wealth of information, especially when dealing with vast numbers of compounds in various forms, such as names, identifiers, or SMILES strings, can present significant logistical and technical challenges.

Methods: To streamline this process, we developed PyComp, a software tool built with Python's PyQt5 library and compiled into an executable with PyInstaller. PyComp provides a systematic way for users to retrieve a list of compounds given as names, IDs (including ID ranges), or SMILES strings and convert them into the desired 3D format.
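As an illustration of the kind of input handling described above, the following is a minimal sketch and not PyComp's actual code. It assumes the compounds are fetched from PubChem (listed in the keywords) through its PUG REST interface; the function names `expand_ids` and `sdf_3d_url` are hypothetical. The sketch expands a comma-separated list of IDs and ID ranges and builds the URL that requests a 3D SDF record for a compound ID or name.

```python
from urllib.parse import quote

# Base of PubChem's PUG REST compound endpoint (assumed data source).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound"


def expand_ids(spec: str) -> list[int]:
    """Expand a string such as '2244, 3000-3002' into a flat list of IDs."""
    ids: list[int] = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas
        if "-" in token:
            start, end = (int(part) for part in token.split("-", 1))
            if start > end:
                raise ValueError(f"descending range: {token!r}")
            ids.extend(range(start, end + 1))  # ranges are inclusive
        else:
            ids.append(int(token))
    return ids


def sdf_3d_url(query, kind: str = "cid") -> str:
    """Build a PUG REST URL for the 3D SDF record of one compound.

    kind is 'cid' for a PubChem compound ID or 'name' for a compound name.
    """
    if kind not in {"cid", "name"}:
        raise ValueError(f"unsupported input kind: {kind!r}")
    # Percent-encode the query so names with spaces are URL-safe.
    return f"{PUG_REST}/{kind}/{quote(str(query), safe='')}/SDF?record_type=3d"


if __name__ == "__main__":
    for cid in expand_ids("2244, 3000-3002"):
        print(sdf_3d_url(cid))
    print(sdf_3d_url("acetylsalicylic acid", kind="name"))
```

The sketch only constructs request URLs, so it runs without network access. Downloading the records, handling SMILES input (whose special characters need extra care in a URL), and coping with compounds that have no 3D conformer are left out.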

Results: PyComp greatly enhances the efficiency of the data extraction, conversion, and storage processes involved in VS. It can search for similar compounds and handle misidentified compounds, offering users an easy-to-use, customizable tool for managing large-scale compound data. By streamlining these operations, PyComp allows researchers to save significant time and effort, thus accelerating the pace of drug discovery research.

Conclusion: PyComp effectively addresses some of the most pressing challenges in high-throughput VS: efficient management and conversion of large volumes of compound data. As a user-friendly, customizable software tool, PyComp is pivotal in improving the efficiency and success of large-scale drug screening efforts, paving the way for faster discovery of potential therapeutic compounds.

Keywords: High-throughput Screening; Pharmaceutical compounds; PubChem; Virtual Screening.