Journal of Computer Research and Development, 2024, Vol. 61, Issue 7: 1754-1770. DOI: 10.7544/issn1000-1239.202330507

Advances in SQL Execution Techniques Based on Query Compilation

Pan Qingfeng (潘青峰)¹, Xu Chen (徐辰)²

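Author information

  • 1. School of Data Science and Engineering, East China Normal University, Shanghai 200062
  • 2. School of Data Science and Engineering, East China Normal University, Shanghai 200062; Shanghai Engineering Research Center of Big Data Management (East China Normal University), Shanghai 200062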

Abstract

Information systems typically rely on data management systems to manage their data. Thanks to its ease of use and flexibility, SQL has long been the mainstream query language for data management: users write SQL statements, submit them to the data management system, and receive the query results. The efficiency of the execution model determines whether the system can respond quickly to user queries. Existing execution models mainly adopt one of two approaches: interpreted execution and compiled execution. Interpreted execution is used by most systems because of its extensibility and maintainability. In contrast, compiled execution generates efficient, customized code for queries that would otherwise be interpreted, and the resulting significant performance gains have led a number of data management systems to implement the technique. However, generating customized code for a query is a complex process that must take many aspects into account, and in some cases a query run with compiled execution may even perform worse than under the traditional Volcano model. We systematically review the research progress of compiled execution from conceptual and technical perspectives. First, we outline the basic concepts of compiled execution and introduce the relevant terminology and background knowledge. Second, we present the related techniques from three perspectives: intermediate code generation, intermediate representation, and machine code generation and execution. Finally, we discuss future directions for compiled execution in light of current research trends in data management systems and recent research work.
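The contrast between the two execution styles named in the abstract can be illustrated with a toy sketch. It is not taken from the paper: the table `rows`, the query, the operator classes, and the helper `generate_query_source` are all hypothetical, Python iterators stand in for the Volcano model's operator interface, and Python source text stands in for the intermediate and machine code a real system would generate.

```python
# Hypothetical toy query: SELECT SUM(price) FROM rows WHERE qty > 10
rows = [
    {"qty": 5, "price": 2.0},
    {"qty": 12, "price": 3.5},
    {"qty": 20, "price": 1.5},
]


# Interpreted execution: generic operators pull tuples from their child one at a time.
class Scan:
    def __init__(self, table):
        self.table = table

    def __iter__(self):
        yield from self.table


class Filter:
    def __init__(self, child, column, threshold):
        self.child, self.column, self.threshold = child, column, threshold

    def __iter__(self):
        for row in self.child:
            if row[self.column] > self.threshold:
                yield row


class SumAgg:
    def __init__(self, child, column):
        self.child, self.column = child, column

    def execute(self):
        total = 0.0
        for row in self.child:
            total += row[self.column]
        return total


def run_interpreted(table):
    # The plan is a tree of reusable operators, walked at run time.
    plan = SumAgg(Filter(Scan(table), "qty", 10), "price")
    return plan.execute()


# Compiled execution: emit source specialized to this one query, then compile it.
def generate_query_source(filter_column, threshold, sum_column):
    # Column names and the constant are baked into the generated loop.
    return (
        "def compiled_query(table):\n"
        "    total = 0.0\n"
        "    for row in table:\n"
        f"        if row[{filter_column!r}] > {threshold!r}:\n"
        f"            total += row[{sum_column!r}]\n"
        "    return total\n"
    )


def run_compiled(table):
    source = generate_query_source("qty", 10, "price")
    namespace = {}
    # compile() and exec() are Python built-ins; they play the role of the code generator's back end here.
    exec(compile(source, "<generated query>", "exec"), namespace)
    return namespace["compiled_query"](table)


interpreted_result = run_interpreted(rows)
compiled_result = run_compiled(rows)
```

Both paths return the same sum. The generated function fuses scan, filter, and aggregation into one loop with no per-operator dispatch, but it pays a generation and compilation cost before the first tuple is processed, which is one plausible reason a compiled query can lose to an interpreted one on small inputs.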

Key words

data management system; query execution; code generation; compiler; just-in-time compilation

Funding

Natural Science Foundation of Shanghai (23ZR1419900)

Publication year

2024
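Journal of Computer Research and Development
Institute of Computing Technology, Chinese Academy of Sciences; China Computer Federation

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 2.649
ISSN: 1000-1239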