Design of a Multi-Scale Data Mining Algorithm under Small-Sample Machine Learning

刘云香¹, 同军红¹, 李穂丰², 吴晓玲¹

Author information

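1. School of Information Technology and Engineering, Guangzhou College of Commerce, Guangzhou, Guangdong 511363, China
2. School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510700, China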

Abstract

Multi-scale data mining takes data information at different scales into account during the mining process. Because data contain a large number of features, a multi-scale data mining method based on a small-sample machine learning algorithm is proposed to improve the efficiency of information processing and to classify data types more accurately. First, part of the users' actions are abstracted, and website page sessions are built to learn different event contents, yielding the small-sample data. Second, a Hilbert space is constructed from complex-valued functions, and the reproducing kernel of the sample elements is computed to extract the features of the small-sample data. Third, a feature matrix is constructed from the feature vectors to adjust the balance among the data, from which the relative entropy of the data is obtained. A multi-scale information database is then built, logistic regression is used to discretize the data feature values, and the support of complex item set indicators is mined, achieving accurate multi-scale data mining. Experimental results show that the proposed method classifies data well, achieves high mining accuracy, and consumes little time.
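As a rough illustration of two quantities named in the abstract, the following Python sketch computes the relative entropy (Kullback-Leibler divergence) between two discrete distributions and the support of an item set over a list of sessions. The function names, the toy session data, and the use of the natural logarithm are assumptions made for illustration only; the paper's own formulation is not given on this page.

```python
import math


def relative_entropy(p, q):
    """KL divergence D(p || q) of two discrete distributions (natural log)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # a term with p_i = 0 contributes 0 by convention
        if qi == 0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


# Hypothetical page sessions, each a set of visited page types.
sessions = [
    {"home", "search", "cart"},
    {"home", "search"},
    {"home", "cart"},
    {"search"},
]
```

For example, `support({"home", "search"}, sessions)` counts the sessions that contain both pages and divides by the total number of sessions, while `relative_entropy(p, q)` is zero exactly when the two distributions coincide.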

Key words

Small-sample machine learning algorithm / Multi-scale data mining / Relative entropy / Feature matrix / Similarity effect

Publication year

2024

Computer Simulation (计算机仿真)

The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD

Impact factor: 0.518

ISSN: 1006-9348

Number of references: 15
