Journal of Hubei Normal University (Natural Science), 2024, Vol. 44, Issue 3: 20-24. DOI: 10.3969/j.issn.2096-3149.2024.03.004

Research on high-dimensional learned index based on stepwise dimensionality reduction

Liu Jinjun (刘进军)¹, Xu Zheng (徐政)¹, Qiao Kai (乔凯)¹, Fang Zhenyi (方振益)¹

Author information

  • 1. College of Computer and Information Engineering, Hubei Normal University, Huangshi, Hubei 435000

Abstract

As data volume and complexity grow, the amount of high-dimensional data such as text, audio, and images has increased significantly, and such data are used ever more frequently. Designing and implementing an efficient high-dimensional index structure is therefore essential. Indexes based on dimensionality reduction have been shown to improve query efficiency on high-dimensional data. As the amount of data increases, however, these techniques inevitably suffer from reduced query efficiency and increased memory usage. To address this problem, this paper proposes a high-dimensional learned index based on dimensionality reduction: stepwise dimensionality reduction maps high-dimensional data to ordered one-dimensional data, which is then used to train the learned index model. Several experiments on synthetic and real-world datasets show that this index structure effectively improves query efficiency and reduces memory consumption.
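The abstract does not spell out the reduction steps or the model, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a two-step reduction (PCA to a few dimensions, then distance from the centroid as the one-dimensional key) and a single linear model with a recorded maximum error as the learned index. The names `reduce_stepwise`, `LinearLearnedIndex`, and `mid_dim` are hypothetical.

```python
import numpy as np

def reduce_stepwise(points, mid_dim=4):
    """Assumed two-step reduction: d -> mid_dim via PCA, then mid_dim -> 1."""
    centered = points - points.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:mid_dim].T
    # Assumed second step: distance from the centroid serves as the 1-D key.
    return np.linalg.norm(reduced, axis=1)

class LinearLearnedIndex:
    """Linear model from key to sorted position, with a bounded local search."""

    def __init__(self, keys):
        self.order = np.argsort(keys, kind="stable")  # sorted slot -> original row
        self.keys = keys[self.order]
        positions = np.arange(len(self.keys))
        self.slope, self.intercept = np.polyfit(self.keys, positions, 1)
        errors = np.abs(self._predict(self.keys) - positions)
        self.max_err = int(np.ceil(errors.max()))

    def _predict(self, key):
        return self.slope * key + self.intercept

    def lookup(self, key):
        """Return the sorted position of a stored key (leftmost match)."""
        n = len(self.keys)
        guess = int(round(self._predict(key)))
        lo = max(0, min(n, guess - self.max_err - 1))
        hi = max(0, min(n, guess + self.max_err + 2))
        # Binary search only inside the window allowed by the error bound.
        return lo + int(np.searchsorted(self.keys[lo:hi], key, side="left"))
```

In this sketch the error bound is only guaranteed for keys that were present when the model was fitted; `index.order[index.lookup(key)]` then recovers the row of the original high-dimensional point. A real high-dimensional query would still need a refinement step over candidates, because distinct points can share nearby one-dimensional keys.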

Keywords

machine learning/dimensionality reduction/learned index

Publication year

2024
Journal of Hubei Normal University (Natural Science)
Hubei Normal College (湖北师范学院)

Impact factor: 0.376
ISSN: 2096-3149