An optimized relational database for querying structural patterns in proteins

Database (Oxford). 2024 Jan 17:2024:baad093. doi: 10.1093/database/baad093.

Abstract

A database is an essential component of almost any software system, and building one involves more than data modeling and schema design: it also requires query optimization and tuning. This article focuses on GSP4PDB, a web system for searching structural patterns in proteins. The system uses a normalized relational database, which has proven inefficient even for simple queries. The article describes the optimization of the GSP4PDB database through two techniques: denormalization and indexing. The empirical evaluation reported in the article shows that combining these techniques improves the efficiency of the database when querying both real and artificial graph-based structural patterns.
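
To make the two techniques named above concrete, the following is a minimal sketch of denormalization followed by indexing. It is an illustration under stated assumptions, not the GSP4PDB schema: the table names (`protein`, `amino`, `ligand`, `distance`, `amino_ligand_edge`), the columns, the sample rows, and the helper functions are all hypothetical, and SQLite from Python's standard library is used only to keep the example self-contained.

```python
import sqlite3


def build_demo():
    """Create a toy normalized schema, then a denormalized, indexed copy."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical normalized schema: one table per entity, linked by keys.
    cur.executescript("""
        CREATE TABLE protein  (id INTEGER PRIMARY KEY, pdb_code TEXT);
        CREATE TABLE amino    (id INTEGER PRIMARY KEY, protein_id INTEGER, symbol TEXT);
        CREATE TABLE ligand   (id INTEGER PRIMARY KEY, protein_id INTEGER, name TEXT);
        CREATE TABLE distance (amino_id INTEGER, ligand_id INTEGER, dist REAL);
    """)
    # Made-up sample rows, for illustration only.
    cur.executemany("INSERT INTO protein VALUES (?, ?)",
                    [(1, "1ABC"), (2, "2XYZ")])
    cur.executemany("INSERT INTO amino VALUES (?, ?, ?)",
                    [(1, 1, "HIS"), (2, 1, "CYS"), (3, 2, "HIS")])
    cur.executemany("INSERT INTO ligand VALUES (?, ?, ?)",
                    [(1, 1, "ZN"), (2, 2, "ZN")])
    cur.executemany("INSERT INTO distance VALUES (?, ?, ?)",
                    [(1, 1, 2.1), (2, 1, 2.3), (3, 2, 6.5)])

    # Denormalization: precompute the joins once and store one wide row
    # per amino-ligand edge, so pattern queries need no joins at read time.
    cur.execute("""
        CREATE TABLE amino_ligand_edge AS
        SELECT p.pdb_code,
               a.id     AS amino_id,
               a.symbol AS amino_symbol,
               l.id     AS ligand_id,
               l.name   AS ligand_name,
               d.dist
        FROM distance d
        JOIN amino   a ON a.id = d.amino_id
        JOIN ligand  l ON l.id = d.ligand_id
        JOIN protein p ON p.id = a.protein_id
    """)

    # Indexing: a composite index on the columns the pattern query filters on.
    cur.execute("""
        CREATE INDEX idx_edge_pattern
        ON amino_ligand_edge (amino_symbol, ligand_name, dist)
    """)
    conn.commit()
    return conn


def find_pattern(conn, amino_symbol, ligand_name, max_dist):
    """Return edges where the given amino acid lies within max_dist of the ligand."""
    return conn.execute(
        """
        SELECT pdb_code, amino_id, ligand_id, dist
        FROM amino_ligand_edge
        WHERE amino_symbol = ? AND ligand_name = ? AND dist <= ?
        ORDER BY dist
        """,
        (amino_symbol, ligand_name, max_dist),
    ).fetchall()
```

A call such as `find_pattern(build_demo(), "HIS", "ZN", 3.0)` reads only the wide table. The trade-off in this sketch is the usual one for denormalization: extra storage and the need to refresh the derived table when the source tables change, in exchange for join-free reads.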

MeSH terms

  • Databases, Factual
  • Software*