NL2SQL Model Based on BiLSTM

Tai Weipeng¹, Liu Yang², Wang Xiaolin¹, Zheng Xiao², Zhong Liang³

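Author information

1. School of Computer Science and Technology, Anhui University of Technology, Ma'anshan 243000, Anhui; School of Information Technology, Anhui University of Technology, Ma'anshan 243000, Anhui
2. School of Computer Science and Technology, Anhui University of Technology, Ma'anshan 243000, Anhui
3. Shanghai Liangjing Space Information Technology Co., Ltd., Shanghai 200000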

Abstract

With the development of Internet technology, many applications now provide quantitative finance services to the public. Most users, however, lack professional knowledge of finance or computing and expect to query data in natural language, so natural language to SQL (NL2SQL) conversion is urgently needed. To address this problem, a Chinese financial NL2SQL algorithm based on the bidirectional long short-term memory (BiLSTM) model is proposed, consisting of an encoding stage and a decoding stage. In the encoding stage, BiLSTM and an attention mechanism are used to generate feature vectors. In the decoding stage, SQL generation is decoupled into nine classification tasks according to the syntax rules of SQL; the tasks depend on one another and are learned jointly, after which complex SQL statements are generated. In addition to the model, a vector library containing financial vocabulary is trained and a data set for the financial domain is constructed. Experiments on this data set show that the method achieves higher accuracy and can effectively solve the SQL generation problem in the financial domain; it has been implemented in a financial quantitative analysis system.
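The decoding idea in the abstract, where SQL generation is split into classification tasks whose outputs are combined into a statement, can be illustrated with a minimal sketch. The abstract does not list the nine tasks, so the slots used here (selected column, aggregation, condition connector, condition column, operator, value) are assumptions borrowed from common slot-filling NL2SQL designs, and the table and column names are hypothetical. The sketch covers only the final assembly step, not the BiLSTM encoder or the classifiers themselves.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical label sets; the abstract does not enumerate the nine subtasks.
AGG_OPS = ["", "AVG", "MAX", "MIN", "COUNT", "SUM"]
COND_OPS = [">", "<", "=", "!="]
CONNECTORS = ["", "AND", "OR"]


@dataclass
class SlotPredictions:
    """Class indices that the classification heads might emit (assumed layout)."""
    select_col: int   # index into the table's column list
    select_agg: int   # index into AGG_OPS
    connector: int    # index into CONNECTORS
    # Each condition is (column index, index into COND_OPS, value text).
    conditions: List[Tuple[int, int, str]] = field(default_factory=list)


def assemble_sql(table: str, columns: List[str], pred: SlotPredictions) -> str:
    """Turn per-slot class predictions into a single SQL string."""
    col = columns[pred.select_col]
    agg = AGG_OPS[pred.select_agg]
    select_expr = f"{agg}({col})" if agg else col
    sql = f"SELECT {select_expr} FROM {table}"
    if pred.conditions:
        # Fall back to AND if the connector head predicted "no connector".
        joiner = f" {CONNECTORS[pred.connector] or 'AND'} "
        clauses = [f"{columns[c]} {COND_OPS[o]} '{v}'"
                   for c, o, v in pred.conditions]
        sql += " WHERE " + joiner.join(clauses)
    return sql


# Hypothetical financial table and predictions, for illustration only.
columns = ["stock_code", "close_price", "trade_date"]
pred = SlotPredictions(
    select_col=1, select_agg=2, connector=1,
    conditions=[(0, 2, "600519"), (2, 0, "2023-01-01")],
)
example_sql = assemble_sql("daily_quotes", columns, pred)
print(example_sql)
```

Values are quoted naively here to keep the sketch short; a real system would pass them as bound query parameters rather than interpolating them into the string.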

Key words

NL2SQL/BiLSTM/Attention mechanism/Vector library/Data set

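Funding

Major Natural Science Research Project of Anhui Provincial Higher Education Institutions (KJ2019ZD09)

Anhui Provincial Key Research and Development Program (202004a07020028)

Anhui Provincial Higher Education Collaborative Innovation Project (GXXT-2019-025)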

Publication year

2024

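Computer Applications and Software

Shanghai Institute of Computing Technology; Shanghai Development Center of Computer Software Technology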

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.615

ISSN: 1000-386X

Number of references: 17
