Variable radius neighborhood rough set model based on hash bucket and clustering

The neighborhood rough set is a data analysis tool for handling uncertainty in machine learning and data mining. The size of the neighborhood granules in a neighborhood rough set depends on the neighborhood radius. Existing neighborhood rough set models usually ignore the distribution of the sample data and assign the same neighborhood radius to every sample, so the resulting neighborhood granules cannot characterize each sample accurately. To address this problem, a variable radius neighborhood rough set model is proposed that uses the distribution information of the data. The dataset is first clustered, and the sample distribution within each cluster is analyzed using hash buckets; an appropriately sized neighborhood radius is then set for each sample so that its information is described more accurately. Finally, the proposed model is compared with the most commonly used neighborhood rough set models on eight UCI datasets. Theoretical analysis and experimental results show that the proposed variable radius neighborhood rough set model achieves better learning performance.
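The abstract does not give the radius formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that cluster labels are already available from any clustering algorithm, that each sample is hashed to a grid cell (the "bucket") within its cluster, and that the radius shrinks in crowded buckets and grows in sparse ones. The function names, the `cell` parameter, and the scaling rule `base_radius * mean_occupancy / occupancy` are all hypothetical. The granule and approximation definitions are the standard ones for neighborhood rough sets.

```python
import math
from collections import defaultdict

def bucket_key(point, cell):
    # Hash a point to the grid cell (bucket) of side `cell` that contains it.
    return tuple(math.floor(x / cell) for x in point)

def variable_radii(points, cluster_ids, base_radius, cell):
    # ASSUMED rule (not from the paper): within each cluster, scale the base
    # radius by mean bucket occupancy / occupancy of the sample's own bucket.
    occupancy = defaultdict(int)
    for p, c in zip(points, cluster_ids):
        occupancy[(c, bucket_key(p, cell))] += 1
    totals, nbuckets = defaultdict(int), defaultdict(int)
    for (c, _), n in occupancy.items():
        totals[c] += n
        nbuckets[c] += 1
    radii = []
    for p, c in zip(points, cluster_ids):
        mean_occ = totals[c] / nbuckets[c]
        occ = occupancy[(c, bucket_key(p, cell))]
        radii.append(base_radius * mean_occ / occ)
    return radii

def neighborhoods(points, radii):
    # Granule of sample i: all samples within Euclidean distance radii[i].
    return [
        {j for j, q in enumerate(points) if math.dist(p, q) <= r}
        for p, r in zip(points, radii)
    ]

def lower_approximation(granules, labels, target):
    # Samples whose whole granule carries the target label.
    return {i for i, g in enumerate(granules)
            if all(labels[j] == target for j in g)}

def upper_approximation(granules, labels, target):
    # Samples whose granule contains at least one target-labelled sample.
    return {i for i, g in enumerate(granules)
            if any(labels[j] == target for j in g)}

# Toy data: cluster ids are assumed to come from a prior clustering step.
points = [(0.1, 0.1), (0.2, 0.2), (0.25, 0.1), (0.6, 0.6), (2.1, 2.1), (2.2, 2.3)]
cluster_ids = [0, 0, 0, 0, 1, 1]
labels = ["a", "a", "a", "b", "b", "b"]

radii = variable_radii(points, cluster_ids, base_radius=0.3, cell=0.5)
granules = neighborhoods(points, radii)
lower_a = lower_approximation(granules, labels, "a")
upper_a = upper_approximation(granules, labels, "a")
print(radii, lower_a, upper_a - lower_a)
```

In this toy example the three samples sharing a crowded bucket get a smaller radius than the isolated sample at (0.6, 0.6), whose larger granule reaches a sample of the other class and therefore places it in the boundary region of class "a".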

variable neighborhood rough sets; hash bucket; clustering; sample distribution; uncertainty

李华、孟祥瑞

Department of Mathematics and Physics, Shijiazhuang Tiedao University, Shijiazhuang 050043

variable radius neighborhood rough set; hash bucket; clustering; sample distribution; uncertainty

National Natural Science Foundation of China project

61806133

2024

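Journal of Jiangsu University of Science and Technology (Natural Science Edition)
Jiangsu University of Science and Technology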

Impact factor: 0.373
ISSN:1673-4807
Year, volume (issue): 2024, 38(4)