
ZFT Index: A Learned Multi-Dimensional Index Based on Piecewise Linear Regression

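Traditional indexes are usually general-purpose data structures that are not designed or optimized for the distribution and characteristics of the data they hold. As the dimensionality of the data space or the volume of data grows, storage consumption can become large and query efficiency can drop sharply. To address these problems, this paper proposes the ZFT index (Z-order FITing-tree index), which consists of an offline part and an online part. The offline construction part uses the Z-order curve to map data points in multi-dimensional space to one-dimensional space, then builds linear regression models to learn the distribution and characteristics of the mapped data. The online part handles point queries and range queries. Experimental results show that the ZFT index is significantly better than the traditional R-tree and UB-tree in both space efficiency and query efficiency, and that it outperforms the ZM index in both range-query speed and model-training speed.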
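The two steps named in the abstract (Z-order mapping, then piecewise linear models over the mapped keys) can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `z_order`, `build_segments`, and `point_query`, the `error` bound, the 16-bit coordinate width, and the greedy cone-narrowing segmentation are all assumptions chosen for illustration, and the paper's actual segmentation and search procedures may differ. Range queries are omitted.

```python
import math
from bisect import bisect_left, bisect_right


def z_order(point, bits=16):
    """Interleave the low `bits` bits of each non-negative integer coordinate."""
    dims = len(point)
    code = 0
    for bit in range(bits):
        for d, coord in enumerate(point):
            code |= ((coord >> bit) & 1) << (bit * dims + d)
    return code


def build_segments(keys, error):
    """Greedy cone-narrowing segmentation of sorted, distinct keys (error >= 0).

    Returns (start_key, start_pos, slope) triples. Within a segment, the line
    through its first key predicts every position to within +/- error.
    """
    segments = []
    if not keys:
        return segments
    start, lo, hi = 0, 0.0, math.inf
    for i in range(1, len(keys)):
        dx = keys[i] - keys[start]
        # Slopes that keep the prediction for position i within the error bound.
        new_lo = max(lo, (i - start - error) / dx)
        new_hi = min(hi, (i - start + error) / dx)
        if new_lo > new_hi:  # no single slope fits any more: close the segment
            segments.append((keys[start], start, (lo + hi) / 2))
            start, lo, hi = i, 0.0, math.inf
        else:
            lo, hi = new_lo, new_hi
    slope = 0.0 if math.isinf(hi) else (lo + hi) / 2
    segments.append((keys[start], start, slope))
    return segments


def point_query(keys, segments, point, error, bits=16):
    """Return the position of `point` in the sorted key array, or -1 if absent."""
    key = z_order(point, bits)
    starts = [seg[0] for seg in segments]  # rebuilt per call for brevity
    s = bisect_right(starts, key) - 1
    if s < 0:
        return -1
    start_key, start_pos, slope = segments[s]
    pred = start_pos + slope * (key - start_key)
    # Search only the window allowed by the error bound (plus rounding slack).
    hi = min(len(keys), math.ceil(pred + error) + 2)
    lo = min(max(0, math.floor(pred - error) - 1), len(keys))
    i = bisect_left(keys, key, lo, hi)
    return i if i < len(keys) and keys[i] == key else -1


# Hypothetical toy data: 50 distinct 2-D integer points.
points = [(x, (7 * x) % 50) for x in range(50)]
keys = sorted(z_order(p) for p in points)
segments = build_segments(keys, error=4)
```

In this sketch a larger `error` yields fewer segments (less space) at the cost of a wider final search window, which is the kind of space versus lookup-cost trade-off a piecewise linear index exposes.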

Keywords: multi-dimensional data; learned index; ZFT index

Wang Xiaoli (王小丽), Chen Huahui (陈华辉)


College of Information Science and Engineering, Ningbo University, Ningbo 315211, Zhejiang, China


Funding: National Natural Science Foundation of China (Grant No. 61572266)

2024

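Computer Applications and Software (计算机应用与软件)
Shanghai Institute of Computing Technology; Shanghai Development Center of Computer Software Technology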

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.615
ISSN: 1000-386X
Year, volume (issue): 2024, 41(10)