Design and Implementation of Database Operator Cache for Select and Sort
Caching is a commonly used technique in databases to enhance query performance.Currently,existing database caching primarily falls into two directions:result caching and block caching.Result caching involves utilizing the final or intermediate results(such as subqueries)obtained during the execution of database queries,while block caching stores the underlying data blocks involved in the queries.This paper takes a different perspective,focusing on the'compu-tational load'contained within the cache,to reexamine the application of caching in query optimi-zation.Building on this,the paper further classifies database caching methods.During query exe-cution,a database query is transformed into a collection of operations(such as selection,sorting,etc.),with each operation corresponding to an operator.The data output by operators during query processing forms intermediate results,containing a portion of the computational load.We cache and leverage this subset of data.This caching approach,which caches a portion of the com-putational load,is termed'operator caching',specifically caching the results of each operation execution.Due to the potential presence of common operators across different queries,perform-ing similar computations on similar data,leveraging operator caching holds significant potential for accelerating query execution performance.The novelty of this paper lies in its exploration of how operator caching can be applied in query optimization,viewed from the perspective of the computational load contained in the cache.Using Filter and Sort operators as examples,we pro-pose and investigate how operator caching can be employed in query optimization.For cache re-use,we introduce a semantic tree-based matching algorithm designed to efficiently match result sets in the cache.Simultaneously,to address the potential degradation of query performance caused by reusing the cache,we suggest the use of a cost-based optimizer to prevent the degrada-tion of query performance.Finally,based on the open-source analytical database ClickHouse,this paper implements a prototype of the Filter and Sort operator caching and conducts extensive experimental testing of the proposed operator caching scheme.The results indicate that,com-pared to block caching and materialized view approaches,the operator caching solution proposed in this paper can achieve a maximum improvement of 9 times and 1.5 times in query response speed when deployed on a local SSD.When deployed in a cloud environment,the operator cac-hing approach can achieve respective improvements of 30 times and 2 times in query response speed.