AMBIT-SMARTS: Efficient Searching of Chemical Structures and Fragments

Mol Inform. 2011 Aug;30(8):707-20. doi: 10.1002/minf.201100028. Epub 2011 Aug 4.

Abstract

We present new developments in the AMBIT open-source software package for efficient searching of chemical structures and structural fragments. AMBIT-SMARTS is a Java-based software package built on top of The Chemistry Development Kit. The AMBIT-SMARTS parser implements the entire SMARTS language specification, together with several syntax extensions that support the custom modifications introduced by third-party software packages such as OpenEye, MOE and OpenBabel. The goal of implementing yet another open-source SMARTS parser is to achieve better performance and compatibility with the multiple existing flavours of the SMARTS language, as well as to provide utilities for running efficient SMARTS queries against large structural databases. We describe a combination of approaches for lowering the computational cost and improving the response time of substructure queries. We perform an exhaustive comparison of the AMBIT algorithm with several subgraph isomorphism implementations. To demonstrate the performance of the entire system from an end user's point of view, we also report response time statistics for Web service substructure search queries against a database of 4.5 million structures. The package is widely applicable to chemoinformatics tasks and has already been used in several projects dealing with descriptor calculation and predictive algorithms, database queries, web applications and web services.
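The abstract does not spell out which approaches are combined. One widely used way to lower the cost of substructure queries, and the one suggested by the "Screening" keyword, is to discard most database entries with a cheap fingerprint test before running the expensive subgraph isomorphism check. The following is a minimal, language-neutral sketch of that idea in Python and not AMBIT's actual API: the feature strings, the 64-bit fingerprint width, and the functions `fingerprint`, `passes_screen` and `search` are all hypothetical, and the exact matcher is supplied by the caller as a stand-in for a real SMARTS matcher.

```python
import zlib
from typing import Callable, Dict, Iterable, List

FP_BITS = 64  # hypothetical fingerprint width, chosen only for illustration


def fingerprint(features: Iterable[str]) -> int:
    """Hash each structural feature to one bit of a fixed-width bitmask."""
    fp = 0
    for feature in features:
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        fp |= 1 << (zlib.crc32(feature.encode("utf-8")) % FP_BITS)
    return fp


def passes_screen(query_fp: int, target_fp: int) -> bool:
    """A target can contain the query only if every query bit is set in it."""
    return (query_fp & target_fp) == query_fp


def search(
    query_features: Iterable[str],
    index: Dict[str, int],
    exact_match: Callable[[str], bool],
) -> List[str]:
    """Return ids that pass the screen and are confirmed by the exact matcher.

    `index` maps structure id -> precomputed fingerprint. `exact_match` is a
    placeholder for the costly subgraph isomorphism test; it is called only
    for structures that survive the screen.
    """
    query_fp = fingerprint(query_features)
    hits = []
    for structure_id, target_fp in index.items():
        if not passes_screen(query_fp, target_fp):
            continue  # rejected cheaply, no isomorphism test needed
        if exact_match(structure_id):
            hits.append(structure_id)
    return hits
```

The screen can produce false positives (hash collisions, or features present in an arrangement that does not match) but never false negatives, which is why the exact matcher remains the final step. In a database setting the fingerprints would be computed once at load time and stored alongside the structures.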

Keywords: Benchmark; Database; SMARTS; Screening; Subgraph isomorphism; Substructure searching.