Multi-Query Optimization Revisited: A Full-Query Algebraic Method

Proc IEEE Int Conf Big Data. 2022 Dec;2022:252-261. doi: 10.1109/bigdata55660.2022.10020338. Epub 2023 Jan 26.

Abstract

Sharing data and computation among concurrent queries has been an active research topic in database systems. While work in this area has produced algorithms and systems that are shown to be effective, it still lacks a logical foundation for query processing and optimization. In this paper, we present PsiDB, a system model for processing a large number of database queries in a batch. The key idea is to generate a single query expression that returns a global relation containing all the data needed by the individual queries. To that end, we propose a class of relational operators, called ψ-operators, for combining the individual queries into the global expression. We tackle the algebraic optimization problem in PsiDB by developing equivalence rules that transform concurrent queries in order to reveal query optimization opportunities. Centering around the ψ-operator, our rules not only cover many optimization techniques adopted in existing batch processing systems, but also reveal new optimization opportunities. Experiments conducted on an early prototype of PsiDB show a performance improvement of up to 36X over a mainstream commercial DBMS.
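
As a rough illustration of the key idea only, the following Python sketch evaluates a batch of selection-projection queries through one shared scan that produces a global relation, then recovers each query's answer by filtering that relation. It is not the ψ-operator or PsiDB's algorithm, neither of which is defined in this abstract. The names Query, build_global_relation, distribute, and the ORDERS sample table are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


@dataclass(frozen=True)
class Query:
    """Hypothetical selection-projection query over a single table."""
    predicate: Callable[[Row], bool]
    predicate_columns: Tuple[str, ...]  # columns the predicate reads
    output_columns: Tuple[str, ...]     # columns the query returns


def run_individually(table: List[Row], queries: List[Query]) -> List[List[Row]]:
    """Baseline: one full scan of the table per query."""
    return [
        [{c: row[c] for c in q.output_columns} for row in table if q.predicate(row)]
        for q in queries
    ]


def build_global_relation(table: List[Row], queries: List[Query]) -> List[Row]:
    """One shared scan: keep a row if any query needs it, and keep
    every column that any query filters on or returns."""
    needed = sorted(
        {c for q in queries for c in q.predicate_columns + q.output_columns}
    )
    return [
        {c: row[c] for c in needed}
        for row in table
        if any(q.predicate(row) for q in queries)
    ]


def distribute(global_relation: List[Row], queries: List[Query]) -> List[List[Row]]:
    """Recover each query's answer by filtering the (smaller) global relation."""
    return [
        [{c: row[c] for c in q.output_columns}
         for row in global_relation if q.predicate(row)]
        for q in queries
    ]


# Illustrative data and batch (assumptions, not taken from the paper).
ORDERS: List[Row] = [
    {"id": 1, "region": "east", "amount": 120},
    {"id": 2, "region": "west", "amount": 40},
    {"id": 3, "region": "east", "amount": 15},
    {"id": 4, "region": "north", "amount": 300},
]

BATCH: List[Query] = [
    Query(lambda r: r["region"] == "east", ("region",), ("id", "amount")),
    Query(lambda r: r["amount"] > 100, ("amount",), ("id", "region")),
]

GLOBAL = build_global_relation(ORDERS, BATCH)
RESULTS = distribute(GLOBAL, BATCH)
```

In this sketch the base table is scanned once rather than once per query, and the per-query filtering runs over the global relation, which omits rows that no query in the batch needs. The answers match those of evaluating each query on its own.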

Keywords: Batch processing; Equivalence rules; Query optimization; Query processing.