Artificial intelligence (AI) has made significant progress in drug discovery, and AI-based tools are now used in all aspects of the field, including chemical structure recognition. We propose a chemical structure recognition framework, Optical Chemical Molecular Recognition (OCMR), that improves data extraction in practical scenarios compared with rule-based and end-to-end deep learning models. OCMR enhances recognition performance by integrating local information from the topology of molecular graphs. It handles challenging cases such as non-canonical drawings and abbreviated atomic groups, and it substantially improves on current state-of-the-art results on multiple public benchmark datasets and one internally curated dataset.
Keywords: Bioinformatics; Chemical informatics; Chemical structure recognition; Molecular graph; OCMR; OCSR.
Copyright © 2023 Elsevier Ltd. All rights reserved.